How to determine the name of a variable in a piece of code
I am trying to write a Halstead complexity measure in X++ (the language is not important) and I think the best way to do this is to run regexes over the source.
I've managed to do 90% of this, but I'm struggling with variable names.
How can I determine the names of the variables in a piece of code?
Consider the following piece of code:
public void main()
{
    int a, b, c, av;
    className class;
    strFmt("%1 %2 %3", a, b, c);
    av = (a + b + c) / 3;
    info("avg = %1");
    if(a)
    {
        a++;
        class.add(a);
    }
    else
    {
        b++;
        class.subtract(b);
    }
    this.main();
}
I expect it to return "a", "b", "c", "av" and "class".
Halstead also needs a count of how often each one occurs. My idea was to store the names above in a list and then use whatever is in the list in a regex query. Catering for every possible use of a variable by hand would be insane.
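For what it's worth, the "keep the names in a list and count them with a regex" idea can be sketched roughly like this. It is in Python rather than X++ purely for illustration, `count_names` is a made-up helper, and it naively counts matches inside strings and comments too, so treat it as a starting point only.

```python
import re
from collections import Counter

# The sample from the question, kept as a plain string.
SOURCE = '''
public void main()
{
    int a, b, c, av;
    className class;
    strFmt("%1 %2 %3", a, b, c);
    av = (a + b + c) / 3;
    info("avg = %1");
    if(a)
    {
        a++;
        class.add(a);
    }
    else
    {
        b++;
        class.subtract(b);
    }
    this.main();
}
'''

def count_names(source, names):
    """Count whole-word occurrences of each name in source (hypothetical helper)."""
    # \b on both sides stops "a" from matching inside "add", "av" or "main".
    alternatives = '|'.join(re.escape(name) for name in names)
    pattern = re.compile(r'\b(?:' + alternatives + r')\b')
    return Counter(match.group(0) for match in pattern.finditer(source))

counts = count_names(SOURCE, ['a', 'b', 'c', 'av', 'class'])
for name in ['a', 'b', 'c', 'av', 'class']:
    print(name, counts[name])
```

On this sample it reports a = 6, b = 5, c = 3, av = 2 and class = 3. The word boundaries are what keep `className` and `subtract` from inflating the counts.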
I ended up cheating my way to a solution. I already had all the operator data (int, public, method names, etc.), so I just substituted those out of the source and then ran the following regex, which found the operands for the metric for me.
'_?\w+(?=([^"]*"[^"]*")*[^"]*$)|".+"'
There were some really good answers here, so I'm going to look into a hybrid of them to improve the implementation at a later date, but for now this gets the information we need and seems to work in all the cases we've tested.
In case anyone is interested, the regex I used for operators is the following:
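To show what the operand regex does, here is a small sketch in Python (not X++; `find_operands` is just an illustrative name). The lookahead only accepts a word when an even number of double quotes follows it up to the end of the text, which is a cheap way of saying "this word is not inside a string literal". The second alternative picks up the string literals themselves.

```python
import re

# The operand regex from the answer, unchanged.
OPERAND_RE = re.compile(r'_?\w+(?=([^"]*"[^"]*")*[^"]*$)|".+"')

def find_operands(source):
    """Return every operand match in order (illustrative helper)."""
    # finditer + group(0) is used because the pattern contains a capturing
    # group, which would make findall return the group instead of the match.
    return [match.group(0) for match in OPERAND_RE.finditer(source)]

print(find_operands('av = (a + b + c) / 3;'))
print(find_operands('x = "a b";'))
```

The first call gives `['av', 'a', 'b', 'c', '3']` and the second gives `['x', '"a b"']`, so the words inside the string are not counted separately. Note that `".+"` is greedy, so two string literals on the same line would come back as one match, and the lookahead rescans the rest of the text for every word, which gets slow on large sources.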
(?i)\(|\{|\w+(?=(\(|:|\.))|\w+(?=\s\w)|(break|continue|return|true|false|retry|asc|breakpoint|desc|null|pause|throw|ttsAbort|ttsBegin|ttsCommit)(?=;)|((try|catch|else|by|do)(?=\n))|(\+=|-=|>=|<=|==|!=|=|\+\+|--|<<|>>|&&|\|\||\*|\/|\+|-|~|&|\^|\||>|<|!|\?|::|:|\.)+(?=([^"]*"[^"]*")*[^"]*$)
It includes all the reserved keywords that are not covered by the first four alternatives, plus the list of operators that X++ uses.
It would need some modification to be used with other languages, but given that other languages have better ways of solving this problem, you probably won't need to.
Thanks for all your answers
This question got me curious about how to do it, and I came across this great post, which has a dedicated AX tool for measuring complexity as well as a 175-page document written about it.
http://bojanjovicic.com/complexity-tool-dynamics-ax-2009/
I'm experimenting with it now and seeing how I can build on it.
I came back with a real answer! Use the SysScannerClass and TreeNode objects to analyze the code properly. Here is a lovely sample I wrote that should make it a piece of cake.
static void JobParseSourceCode(Args _args)
{
    TreeNode treeNode = TreeNode::findNode(@'\Data Dictionary\Tables\SalesTable\Methods\find');
    SysScannerClass sysScannerClass = new SysScannerClass(treeNode);
    int symbol;
    int curLine;
    str lineStr;

    setPrefix("Scanning " + treeNode.treeNodePath());

    for (symbol = sysScannerClass.firstSymbol(); symbol; symbol = sysScannerClass.nextSymbol())
    {
        if (curLine != sysScannerClass.line())
        {
            curLine = sysScannerClass.line();
            lineStr = sysScannerClass.sourceLine(curLine);
        }

        // NOTE: symbol corresponds to macros in #TokenTypes
        info(strFmt("Line %1: %2\t(Col %3): '%4' - MacroValue: %5, [%6]", curLine, lineStr, sysScannerClass.col(), sysScannerClass.strValue(), symbol, xppScanner::symbolClass(symbol)));
    }
}
Well, the example doesn't quite qualify as X++ source, because class is a reserved word and cannot be used as a variable name.
Also, a rough search for [a-zA-Z_][a-zA-Z_0-9]* will give you all the strings that could be a variable name. But without a complete parser, it will be difficult to determine whether each one is a keyword, a class name, a table name, etc., or a genuine variable name.
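A rough sketch of that search in Python (not X++), just to show where it helps and where it falls short. The `NOT_VARIABLES` set is a tiny hand-picked assumption, not a real X++ keyword list, and the pattern uses `*` so that single-letter names such as `a` are kept.

```python
import re

# Anything that looks like an identifier: a letter or underscore,
# followed by any number of letters, digits or underscores.
IDENTIFIER_RE = re.compile(r'[a-zA-Z_][a-zA-Z_0-9]*')

# Hypothetical and deliberately incomplete list of words that are not variables.
NOT_VARIABLES = {'public', 'void', 'int', 'if', 'else', 'this'}

def candidate_names(source):
    """Return identifier-like words in first-seen order, minus known non-variables."""
    seen = []
    for word in IDENTIFIER_RE.findall(source):
        if word not in NOT_VARIABLES and word not in seen:
            seen.append(word)
    return seen

print(candidate_names('int a, b, c, av;'))
print(candidate_names('className class;'))
```

The first line comes out as `['a', 'b', 'c', 'av']`, which is what you want. The second comes out as `['className', 'class']`: nothing in the regex can tell that `className` is a type and not a variable, which is exactly the part that needs a parser or a symbol table.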
You can also use TextBuffer to tokenize your source:
static void TokenTest(Args _args)
{
    str src = @'
public void main()
{
int a = 7, b = 11, c = 13, av;
info(strFmt("%1 %2 %3", a, b, c));
av = (a + b + c) / 3;
info(strFmt("avg = %1"));
this.main();
}
';
    TextBuffer t = new TextBuffer();

    t.ignoreCase(true);
    t.setText(src); // Set the text to break into tokens
    while (t.nextToken(false, ' (){}.,:;!=+-*/\n')) // The delimiters to split on
    {
        info(t.token());
    }
}
This won't work with strings and comments, of course.
There is even an undocumented kernel class called Keywords!
Perhaps the best option would be to integrate with the cross-reference tool; it has already done the splitting for you!
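To see why strings are a problem for delimiter-based splitting, here is a rough Python analogue of the same idea. This is not TextBuffer itself, only a sketch that splits on the same delimiter characters; `tokenize` is a made-up name.

```python
import re

# Same delimiter characters as in the TextBuffer example.
DELIMITERS = ' (){}.,:;!=+-*/\n'

def tokenize(source, delimiters=DELIMITERS):
    """Split source on runs of delimiter characters and drop empty pieces."""
    splitter = re.compile('[' + re.escape(delimiters) + ']+')
    return [token for token in splitter.split(source) if token]

print(tokenize('av = (a + b + c) / 3;'))
print(tokenize('info("avg = %1");'))
```

The first line splits cleanly into `['av', 'a', 'b', 'c', '3']`. The second gives `['info', '"avg', '%1"']`: the string literal is cut in two at the space and the equals sign, so anything inside quotes (or inside a comment) ends up mixed in with the real tokens.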
I'm afraid your remaining 10% might take 90% of your time!