Semantic Abstract Graphs in Scala

Krzysztof Borowski

Simple example

//File A.scala
package com.virtuslab.semanticgraphs
class A(a: String)
//File B.scala
package com.virtuslab.semanticgraphs
class B(b: String, a: A)

Abstract Syntax Tree (AST)

A tree representation of the abstract syntactic structure of source code.

It represents the exact structure of a given code fragment.
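To make the idea concrete, here is a minimal sketch of a syntax tree for `class A(a: String)`. The node types (`Tree`, `ClassDef`, `Param`, `TypeName`) are hypothetical and far simpler than real scalameta trees. They only illustrate that the tree mirrors the structure of the code.

```scala
object AstSketch {
  // Hypothetical, simplified node types; this is not the scalameta API.
  sealed trait Tree { def children: List[Tree] }

  final case class TypeName(value: String) extends Tree {
    def children: List[Tree] = Nil
  }
  final case class Param(name: String, tpe: TypeName) extends Tree {
    def children: List[Tree] = List(tpe)
  }
  final case class ClassDef(name: String, params: List[Param]) extends Tree {
    def children: List[Tree] = params
  }

  // Tree for: class A(a: String)
  val classA: Tree = ClassDef("A", List(Param("a", TypeName("String"))))

  // Depth-first traversal that collects every node, parent before children.
  def collect(tree: Tree): List[Tree] =
    tree :: tree.children.flatMap(collect)
}
```

In this model `TypeName("String")` is only a name: nothing in the tree says which `String` definition it refers to.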

Scala Trees (scalameta)

What about semantic data?

The tree only holds information about the code and its structure within a single file.

Can we know more about semantic data?

  • Where is the definition of a class that is instantiated in this file?
  • How can we easily gather and present information about types, scopes, modifiers etc.?

SemanticDB

  • the standard way of representing semantic data
  • contains symbols (definitions) and occurrences (references to symbols)
  • unique stable symbol identifiers
  • the exact location of each symbol and occurrence

SemanticDB - example

A.scala
-------

Summary:
Schema => SemanticDB v4
Uri => A.scala
Text => empty
Language => Scala
Symbols => 4 entries
Occurrences => 7 entries

Symbols:
com/virtuslab/semanticgraphs/A# => class A extends AnyRef { +2 decls }
com/virtuslab/semanticgraphs/A#`<init>`(). => primary ctor <init>(a: String)
com/virtuslab/semanticgraphs/A#`<init>`().(a) => param a: String
com/virtuslab/semanticgraphs/A#a. => private[this] val method a: String

Occurrences:
[0:8..0:11) => com/
[0:12..0:21) => com/virtuslab/
[0:22..0:36) => com/virtuslab/semanticgraphs/
[1:6..1:7) <= com/virtuslab/semanticgraphs/A#
[1:7..1:7) <= com/virtuslab/semanticgraphs/A#`<init>`().
[1:8..1:9) <= com/virtuslab/semanticgraphs/A#a.
[1:11..1:17) => scala/Predef.String#
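The symbol strings in the listing encode what kind of definition they name in their last descriptor: `/` closes a package, `#` a type, `.` a term, `().` a method, `(name)` a parameter and `[name]` a type parameter. The sketch below classifies a symbol by that suffix. `SymbolKinds` and `kindOf` are hypothetical helper names, not part of any SemanticDB library, and the `example/...` symbols in the usage note are made up for illustration.

```scala
object SymbolKinds {
  sealed trait Kind
  case object Package extends Kind
  case object Type extends Kind
  case object Term extends Kind
  case object Method extends Kind
  case object Parameter extends Kind
  case object TypeParameter extends Kind
  case object Local extends Kind

  // Classify a symbol by its trailing descriptor.
  // The method check must come before the term check, since both end in '.'.
  // Anything without a known suffix (for example "local0") is treated as local.
  def kindOf(symbol: String): Kind =
    if (symbol.endsWith("/")) Package
    else if (symbol.endsWith("#")) Type
    else if (symbol.endsWith(").")) Method
    else if (symbol.endsWith(".")) Term
    else if (symbol.endsWith(")")) Parameter
    else if (symbol.endsWith("]")) TypeParameter
    else Local
}
```

For example, `SymbolKinds.kindOf("scala/Predef.String#")` yields `Type`, while the made-up symbol `example/Foo#bar().(x)` is classified as `Parameter`.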

Scala Trees + SemanticDB

  • Scala Trees: information about the code structure
  • SemanticDB:
    • Unique stable symbols across the source code
    • Additional semantic data (signatures, types, modifiers, return types and more)

Abstract Semantic Graph

  • A form of abstract syntax in which an expression of a programming language is represented by a graph
  • A higher level of abstraction than an abstract syntax tree (AST)

Graph Indexer

  • Generate SemanticDB files
  • For each SemanticDB file:
    • Read the source file and parse it
    • Traverse the Tree:
      • Bind the node with SemanticDB (by location)
      • Upsert all semantic information as graph node properties
      • Create new links between symbols
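The traversal step can be sketched roughly as follows. Everything here is a simplified assumption: `TreeNode`, `Occurrence`, `Graph` and the edge labels `declares` and `references` are hypothetical stand-ins for scalameta trees, SemanticDB data and the real graph store, and each tree node is assumed to carry the position of its name so that it can be matched against an occurrence range.

```scala
object IndexerSketch {
  // Hypothetical stand-ins for scalameta trees and SemanticDB occurrences.
  final case class Range(startLine: Int, startCol: Int, endLine: Int, endCol: Int)
  final case class TreeNode(kind: String, range: Range, children: List[TreeNode])
  sealed trait Role
  case object Definition extends Role
  case object Reference extends Role
  final case class Occurrence(range: Range, symbol: String, role: Role)

  final case class GraphNode(symbol: String, properties: Map[String, String])
  final case class Edge(from: String, to: String, label: String)
  final case class Graph(nodes: Map[String, GraphNode], edges: Set[Edge]) {
    // Insert the node, or merge new properties into an existing one.
    def upsert(symbol: String, props: Map[String, String]): Graph = {
      val merged = nodes.get(symbol).map(_.properties).getOrElse(Map.empty[String, String]) ++ props
      copy(nodes = nodes.updated(symbol, GraphNode(symbol, merged)))
    }
    def link(from: String, to: String, label: String): Graph =
      copy(edges = edges + Edge(from, to, label))
  }

  def index(
      root: TreeNode,
      occurrences: List[Occurrence],
      signatures: Map[String, String] // symbol -> printed signature
  ): Graph = {
    // Bind tree nodes to symbols by location.
    val byRange: Map[Range, Occurrence] = occurrences.map(o => o.range -> o).toMap

    // `owner` is the closest enclosing definition, if any.
    def visit(node: TreeNode, owner: Option[String], graph: Graph): Graph = {
      val (nextOwner, updated) = byRange.get(node.range) match {
        case Some(Occurrence(_, symbol, Definition)) =>
          val props = Map("kind" -> node.kind) ++ signatures.get(symbol).map(s => "signature" -> s)
          val withNode = graph.upsert(symbol, props)
          (Option(symbol), owner.fold(withNode)(o => withNode.link(o, symbol, "declares")))
        case Some(Occurrence(_, symbol, Reference)) =>
          (owner, owner.fold(graph)(o => graph.upsert(symbol, Map.empty).link(o, symbol, "references")))
        case None =>
          (owner, graph)
      }
      node.children.foldLeft(updated)((g, child) => visit(child, nextOwner, g))
    }

    visit(root, None, Graph(Map.empty, Set.empty))
  }
}
```

Using an upsert rather than a plain insert means a symbol that is first seen as a reference in one file and later as a definition in another ends up as a single node with merged properties.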

Semantic Graphs

//File A.scala
package com.virtuslab.semanticgraphs
class A(a: String)
//File B.scala
package com.virtuslab.semanticgraphs
class B(b: String, a: A)
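As a rough illustration, the two classes above could yield a graph like the one modelled below. The edge labels (`declares`, `has type`) and the exact set of edges are assumptions made for this sketch; the real indexer may record more or different relations.

```scala
object SemanticGraphSketch {
  final case class Edge(from: String, to: String, label: String)

  val pkg = "com/virtuslab/semanticgraphs/"

  // Assumed edges for classes A and B; the labels are hypothetical.
  val edges: List[Edge] = List(
    Edge(pkg + "A#", pkg + "A#a.", "declares"),
    Edge(pkg + "A#a.", "scala/Predef.String#", "has type"),
    Edge(pkg + "B#", pkg + "B#b.", "declares"),
    Edge(pkg + "B#b.", "scala/Predef.String#", "has type"),
    Edge(pkg + "B#", pkg + "B#a.", "declares"),
    Edge(pkg + "B#a.", pkg + "A#", "has type")
  )

  // All symbols reachable from `start` by following edges in their direction.
  def reachable(start: String): Set[String] = {
    @annotation.tailrec
    def loop(frontier: List[String], seen: Set[String]): Set[String] = frontier match {
      case Nil => seen
      case head :: tail =>
        val next = edges.collect { case Edge(`head`, to, _) if !seen(to) => to }
        loop(next ++ tail, seen ++ next)
    }
    loop(List(start), Set.empty)
  }
}
```

With these edges, class `B` reaches class `A` through its parameter `a`, but `A` does not reach `B`. Cross-file relations of this kind are what a per-file syntax tree cannot express.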

The Jungle

IntelliJ plugin

IDE integration

New LSP extension?

Visualisation

Visualisation - Metals

Possible use cases

  • Learning about the project
    • Project visualisation
    • Project evolution visualisation
  • Refactoring (Rory Graves - tomorrow at 10:15)
    • Automatic module extraction
    • Complex dependency elimination
  • Maintaining
    • Anomaly detection (automatic with CI)
    • Advice on cleaner architecture