Wednesday, August 19, 2015

scala-cpd, yet another Scala copy-paste detector

You could treat this post as an ad, because it literally is one.

I realized that our code contains some copy-pasted blocks. The code is in Scala, obviously. So I went looking for a working tool for this (at least to find those blocks). I found only two implementations, and neither worked for me. One is based on PMD (a classic tool for this job); the other I could not run because of its older sbt version.

So... I wrote my own. It's an sbt plugin.


The code is pretty simple. Scala 2.10 ships with macros and runtime reflection, including ToolBox, which means we get the AST for free. We then traverse the tree and put all matching subtrees into a dictionary (as strings). The matching subtrees are blocks, function applications, vals, defs and so on. And we don't care about comments, because the ToolBox parser builds an AST for code only, not for comments.
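To make the idea concrete, here is a minimal sketch of the parse-and-index step. It is not the plugin's actual code: the names `SubtreeIndex` and `index` are made up for illustration, and it assumes `scala-reflect` and `scala-compiler` are on the classpath so that ToolBox is available.

```scala
import scala.reflect.runtime.currentMirror
import scala.reflect.runtime.universe._
import scala.tools.reflect.ToolBox

// Hypothetical helper, for illustration only.
object SubtreeIndex {
  // Runtime ToolBox: parses source text into an untyped AST.
  private val toolBox = currentMirror.mkToolBox()

  // Parse the source, pick the subtree kinds we care about,
  // and group them by their printed (string) form.
  def index(source: String): Map[String, List[Tree]] = {
    val root = toolBox.parse(source)
    val candidates = root.collect {
      case t @ (_: Block | _: Apply | _: ValDef | _: DefDef) => t
    }
    candidates.groupBy(_.toString)
  }
}
```

Calling `SubtreeIndex.index` on a snippet where two methods share the same body should yield at least one key that maps to two trees.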

If two subtrees are equal (syntactically, by _.toString) and big enough, it's a problem. Houston, we have one.
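The "equal and big enough" rule could look roughly like this. It is only a sketch under my own assumptions: `Occurrence`, `duplicates` and `minLength` are hypothetical names, and "big enough" is measured here as the length of the printed subtree in characters, which may differ from what the plugin really does.

```scala
// Hypothetical names, for illustration only.
object DuplicateReport {
  // A printed subtree together with the file it came from.
  final case class Occurrence(file: String, rendered: String)

  // Keep only printed subtrees that are long enough (assumption: size is
  // the character count) and that occur more than once.
  def duplicates(occurrences: Seq[Occurrence], minLength: Int): Map[String, Seq[Occurrence]] =
    occurrences
      .groupBy(_.rendered)
      .filter { case (rendered, hits) => rendered.length >= minLength && hits.size > 1 }
}
```

The threshold matters: without it, tiny subtrees such as a single short expression would be reported as copy-paste all over the code base.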

Admittedly, the first version is simple. The next version, I hope, will detect val renaming. That should be a simple task, since we work with an AST rather than with code as a string.

At least, we use it in our CI process. For each commit, TeamCity runs a task with this plugin and checks that the number of copy-pasted blocks is less than or equal to the number for the previous commit.
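The check itself is a simple ratchet. A sketch of it, with hypothetical names (`DuplicateRatchet`, `check`) and without the part that stores the previous count between builds, since that depends on the CI setup:

```scala
// Hypothetical helper, for illustration only.
object DuplicateRatchet {
  // Right(current) when the count did not grow, Left(message) otherwise.
  def check(previous: Int, current: Int): Either[String, Int] =
    if (current <= previous) Right(current)
    else Left(s"copy-pasted blocks grew from $previous to $current")
}
```

A build step would fail the build on `Left` and record the value from `Right` as the new baseline.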

It would be a great pleasure if you used it too.

PS. Three years ago we used Simian. It's good enough, but it works on plain strings, not on an AST.
