Improving my previous OpenRewrite recipe
Publish Date: Jun 19
I started discovering OpenRewrite last week by writing a Kotlin recipe that moves Kotlin files according to the official directory structure recommendation. I mentioned some future work, and here it is. In this post, I want to describe how to compute the root package instead of letting the user set it.
Reminder
Last week, I developed a recipe to follow the Kotlin recommendation regarding directory structure:
In pure Kotlin projects, the recommended directory structure follows the package structure with the common root package omitted. For example, if all the code in the project is in the org.example.kotlin package and its subpackages, files with the org.example.kotlin package should be placed directly under the source root, and files in org.example.kotlin.network.socket should be in the network/socket subdirectory of the source root.
In a Java project, if you have packages ch.frankel.foo, ch.frankel.bar, and ch.frankel.baz, you'll get the following structure:
src
|__ main
    |__ java
        |__ ch
            |__ frankel
                |__ foo
                |__ bar
                |__ baz
In Kotlin projects, many developers follow the same structure as above, but it can be flattened as:
src
|__ main
    |__ kotlin
        |__ foo
        |__ bar
        |__ baz
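To make the mapping concrete, here is a minimal sketch of how a package translates to a directory once the common root is omitted. The flattenedDirectory helper and its parameter names are hypothetical, chosen for illustration only; they belong neither to OpenRewrite nor to the recipe.

```kotlin
import java.nio.file.Path
import java.nio.file.Paths

// Hypothetical helper: maps a package to its directory under the source root,
// omitting the common root package.
fun flattenedDirectory(sourceRoot: String, rootPackage: String, packageName: String): Path {
    val relative = packageName
        .removePrefix(rootPackage) // drop the common root, e.g. "ch.frankel"
        .removePrefix(".")         // drop the separator left behind, if any
        .replace('.', '/')         // remaining segments become directories
    return Paths.get(sourceRoot).resolve(relative)
}
```

With ch.frankel as the root, ch.frankel.foo maps to src/main/kotlin/foo, while ch.frankel itself maps directly to src/main/kotlin.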
The work
My recipe's original version mandated that you configure the root package yourself, e.g., ch.frankel for the above example. However, it should be possible to compute it automatically by looking at the source files. It adds an extra step to the process: before moving any file, the recipe should look at each source file, get its package, compute the longest common prefix with the current root, make it the new root, and go to the next source file. The regular Recipe doesn't work in this case. We need to switch to a ScanningRecipe:
If a recipe needs to generate new source files or needs to see all source files before making changes, it must be a ScanningRecipe. A ScanningRecipe extends the normal Recipe and adds two key objects: an accumulator and a scanner. The accumulator object is a custom data structure defined by the recipe itself to store any information the recipe needs to function. The scanner object is a visitor which populates the accumulator with data.
Scanning recipes offer two steps: the first to gather data, the second to do the work.
We must design our algorithm within the constraints of OpenRewrite, and they are the following: in the first phase, for each source file, OpenRewrite will call the getScanner() method that returns a visitor of our choice. In turn, OpenRewrite calls the visitor's methods, which can access the accumulator.
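The two phases can be pictured with a toy model. The sketch below is not the OpenRewrite API: ToyScanningRecipe, runToy, and CountingRecipe are names made up for this illustration, and files are reduced to plain values. It only shows the property that matters here, namely that the accumulator is fully populated before the first edit happens.

```kotlin
// Toy model of a two-phase recipe. These types are illustrative only and are
// NOT the OpenRewrite API.
interface ToyScanningRecipe<A, F> {
    fun initialValue(): A
    fun scanner(acc: A): (F) -> Unit // phase 1: reads a file, updates the accumulator
    fun visitor(acc: A): (F) -> F    // phase 2: returns the (possibly changed) file
}

fun <A, F> runToy(recipe: ToyScanningRecipe<A, F>, files: List<F>): List<F> {
    val acc = recipe.initialValue()
    // A scanner is requested for each file, and they all share the same accumulator.
    for (file in files) recipe.scanner(acc)(file)
    // Edits only start once every file has been scanned.
    return files.map { recipe.visitor(acc)(it) }
}

// Example: the accumulator counts the files; the visitor appends the total to each name.
class CountingRecipe : ToyScanningRecipe<IntArray, String> {
    override fun initialValue() = IntArray(1)
    override fun scanner(acc: IntArray): (String) -> Unit = { acc[0]++ }
    override fun visitor(acc: IntArray): (String) -> String = { "$it/${acc[0]}" }
}
```

Running runToy(CountingRecipe(), listOf("a", "b", "c")) gives every file the final count of 3, which a single-pass recipe could not know when it handles the first file.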
My first naive approach was to use a collection as the accumulator, but it's not necessary. The algorithm is much simpler if we set a mutable placeholder that holds the package root and update it if necessary during each visit. The initial value should be null.
If the value is null, which happens on the first visit, set the package root to the source file's package.
If the value is an empty string, skip (see below).
In any other case, compute the new package root by finding the longest prefix between the existing package root and the source file's package.
It might result in an empty string, indicating that packages have no common root, e.g., ch.frankel.foo and org.frankel.foo.
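The three cases above can be sketched outside OpenRewrite as a plain fold over package names. The helper names commonPackage and computeRoot are hypothetical. This sketch compares packages segment by segment rather than character by character, so that ch.frankel.foo and ch.frankel.foobar yield ch.frankel; a character-based prefix such as Kotlin's commonPrefixWith returns ch.frankel.foo for that pair, which may or may not matter depending on your package names.

```kotlin
// Longest common prefix of two package names, compared segment by segment.
fun commonPackage(a: String, b: String): String =
    a.split('.').zip(b.split('.'))
        .takeWhile { (left, right) -> left == right }
        .joinToString(".") { it.first }

// Folds all packages into a single root, following the null / "" / other cases.
fun computeRoot(packages: Iterable<String>): String? {
    var root: String? = null
    for (pkg in packages) {
        root = when {
            root == null -> pkg                 // first file: its package is the temporary root
            root.isEmpty() -> ""                // no common root was found earlier: nothing to do
            else -> commonPackage(root, pkg)    // shrink the root to the shared prefix
        }
    }
    return root
}
```

For ch.frankel.foo, ch.frankel.bar, and ch.frankel.baz, computeRoot returns ch.frankel; for ch.frankel.foo and org.frankel.foo, it returns an empty string.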
Here's the updated code:
import org.openrewrite.ExecutionContext
import org.openrewrite.ScanningRecipe
import org.openrewrite.Tree
import org.openrewrite.TreeVisitor
import org.openrewrite.kotlin.KotlinIsoVisitor
import org.openrewrite.kotlin.tree.K
import java.nio.file.Path
import java.nio.file.Paths
import java.util.concurrent.atomic.AtomicReference

class FlattenStructure(private val rootPackage: String?) : ScanningRecipe<AtomicReference<String?>>() { //1 //2

    constructor() : this(null) //3

    override fun getDisplayName(): String = "Flatten Kotlin package directory structure"
    override fun getDescription(): String = "Move Kotlin files to match idiomatic layout by omitting the root package according to the official recommendation."

    override fun getInitialValue(ctx: ExecutionContext) = AtomicReference<String?>(null) //4

    override fun getScanner(acc: AtomicReference<String?>): TreeVisitor<*, ExecutionContext> {
        if (rootPackage != null) return TreeVisitor.noop<Tree, ExecutionContext>() //5
        return object : KotlinIsoVisitor<ExecutionContext>() {
            override fun visitCompilationUnit(cu: K.CompilationUnit, ctx: ExecutionContext): K.CompilationUnit {
                val packageName = cu.packageDeclaration?.packageName ?: return cu
                val computedPackage = acc.get()
                when (computedPackage) {
                    null -> acc.set(packageName) //6
                    "" -> {} //7
                    else -> {
                        val commonPrefix = packageName.commonPrefixWith(computedPackage).removeSuffix(".") //8
                        acc.set(commonPrefix)
                    }
                }
                return cu
            }
        }
    }

    override fun getVisitor(acc: AtomicReference<String?>): TreeVisitor<*, ExecutionContext> {
        return object : KotlinIsoVisitor<ExecutionContext>() {
            override fun visitCompilationUnit(cu: K.CompilationUnit, ctx: ExecutionContext): K.CompilationUnit {
                val packageName = cu.packageDeclaration?.packageName ?: return cu
                val packageToSet: String? = rootPackage ?: acc.get() //9
                if (packageToSet == null || packageToSet.isEmpty()) return cu
                val relativePath = packageName.removePrefix(packageToSet).removePrefix(".").replace('.', '/')
                val filename = cu.sourcePath.fileName.toString()
                val newPath: Path = Paths.get("src/main/kotlin").resolve(relativePath).resolve(filename)
                return cu.withSourcePath(newPath)
            }
        }
    }
}
1. Inherit from ScanningRecipe instead of directly from Recipe
2. Set the accumulator type to be an AtomicReference<String?>
3. Make the configuration easier when you don't override the package root
4. The initial root is uninitialized
5. Skip the computation if the root package is manually set
6. If it's the first file visited, the accumulator holds null, and we can set the (temporary) root as the current package
7. One of the previous computations returned no common root: do nothing
8. Find the longest common prefix between the held package root and the current package
9. The only difference with the original code: we check if the root package has been set manually; otherwise, we use the one computed in the first pass
Optimizing the recipe
You may have noticed that when there is no common root, e.g., ch.frankel.foo and org.frankel.foo, we scan all files anyway. In a small codebase, it's not a big issue, but when scanning millions of source files, that's a huge waste of CPU cycles and time. If you run the recipe in the cloud, it directly translates to money. To optimize the recipe, we should stop scanning as soon as we detect that the computed package root is an empty string.
Here's the updated code:
override fun getScanner(acc: AtomicReference<String?>): TreeVisitor<*, ExecutionContext> {
    if (rootPackage != null) return TreeVisitor.noop<Tree, ExecutionContext>() //1
    val currentPackage = acc.get()
    if (currentPackage == "") return TreeVisitor.noop<Tree, ExecutionContext>() //2
    return object : KotlinIsoVisitor<ExecutionContext>() {
        override fun visitCompilationUnit(cu: K.CompilationUnit, ctx: ExecutionContext): K.CompilationUnit {
            val packageName = cu.packageDeclaration?.packageName ?: return cu
            // Different call than the one above!
            val currentPackage = acc.get()
            // First scanned file
            if (currentPackage == null) acc.set(packageName) //3
            else {
                // Find the longest common prefix between the stored package and the current one
                val commonPrefix = packageName.commonPrefixWith(currentPackage).removeSuffix(".")
                acc.set(commonPrefix)
            }
            return cu
        }
    }
}
1. If the root package has been set, skip visiting
2. If one of the previous computations sets an empty string, there isn't any common root package: skip visiting
3. Simplify the visitor by removing the clause when the accumulator is an empty string since it can't happen anymore
Note that OpenRewrite still scans each file, but at least doesn't visit it thanks to the no-op visitor.
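To see the effect of the early return without running OpenRewrite, here is a toy stand-in for the scanning phase. It assumes, as described earlier, that a scanner is requested once per source file; scanPackages is a hypothetical function, and each "file" is reduced to its package name.

```kotlin
import java.util.concurrent.atomic.AtomicReference

// Toy stand-in for the scanning phase: NOT the OpenRewrite API.
// Returns the packages that were actually visited.
fun scanPackages(packages: List<String>, acc: AtomicReference<String?>): List<String> {
    val visited = mutableListOf<String>()
    for (pkg in packages) {
        // Counterpart of returning the no-op visitor: an empty root can never grow back.
        if (acc.get() == "") continue
        visited += pkg
        val current = acc.get()
        if (current == null) acc.set(pkg) // first scanned file
        else acc.set(pkg.commonPrefixWith(current).removeSuffix("."))
    }
    return visited
}
```

With ch.frankel.foo, org.frankel.foo, and org.frankel.bar in that order, only the first two packages are visited in this model: the second visit sets the accumulator to an empty string, and the third file gets skipped.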
Counting visits
I made a couple of attempts before finding the right approach to the above. To ensure that I got it right, I wanted to display the number of visits by the scanner. We can use the accumulator to increment the visit count. Here are the changes I made:
Migrated from an AtomicReference<String?> to an AtomicReference<Pair<Int, String?>>
The getInitialValue() function returns AtomicReference<Pair<Int, String?>>(0 to null)
During each visit:
Get the visits count from the accumulator
Increment it
Print it
Store it back in the accumulator
Update the tests accordingly
With packages ch.frankel.blog.foo, org.frankel.blog.bar, and org.frankel.blog.baz, the log shows:
Because I made these changes just to validate my understanding of how OpenRewrite works, I put them in the visits_count branch on GitHub. To see the differences, execute git diff master visits_count.
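For readers who don't want to check out the branch, the accumulator update described above could look roughly like the sketch below. It is an approximation, not the code from the visits_count branch: recordVisit is a hypothetical helper, and the printed message format is made up for this sketch.

```kotlin
import java.util.concurrent.atomic.AtomicReference

// Hypothetical helper: increments the visit counter and updates the root in one step.
// Pair is immutable, so each visit stores a fresh Pair back in the accumulator.
fun recordVisit(acc: AtomicReference<Pair<Int, String?>>, packageName: String): Int {
    val (visits, root) = acc.get()
    val newRoot =
        if (root == null) packageName // first scanned file
        else packageName.commonPrefixWith(root).removeSuffix(".")
    val newVisits = visits + 1
    println("Visit #$newVisits: $packageName") // message format is an assumption
    acc.set(newVisits to newRoot)
    return newVisits
}
```

The accumulator starts as AtomicReference<Pair<Int, String?>>(0 to null), matching the initial value mentioned above. The scanner would still have to return the no-op visitor once the root is an empty string, otherwise the counter would keep growing.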
Conclusion
In this post, I added the automatic computation of the root package. I had to change my design and understand how scanning recipes work. Then, I skipped further visits when there wasn't any common root package to optimize performance.
The recipe is still not serializable, even though serializability is recommended. I also noticed that my tests didn't leverage OpenRewrite's testing API. There's still a lot of work to do!
The complete source code for this post can be found on GitHub: