Find the most frequently edited files in ClearCase

Question

We are currently planning a quality improvement exercise, and I would like to target the most edited files in our ClearCase repository. Since we have just gone through a bug-fixing phase, the most frequently edited files should be a good indicator of where the code is most error-prone, and therefore most in need of quality improvement.

Does anyone know of a way to get a list of the 100 most edited files? Preferably, this would cover changes made across multiple branches.

+2

version-control clearcase refactoring

mR_fr0g Oct 16 '09 at 9:01


2 answers

(See the other answer for the more complex case: multiple branches.)

First, use a dynamic view: it is easier and faster to update its content and to tinker with its config spec rules.

If the bug fixing was done on a branch starting from a given label, set up a dynamic view with the following config spec:

element * .../MY_BRANCH/LATEST
element * MY_STARTING_LABEL
element * /main/LATEST

Then list all files with their current version number (which is closely related to the number of edits):

ct find . -type f -exec "cleartool desc -fmt """%Ln\t\t%En\n""" """%CLEARCASE_PN%""""|sort /R|head -100

This is Windows syntax (note the triple double quotes around %CLEARCASE_PN%, which accommodate spaces in file names). The head command comes from the GnuWin32 library.
The most edited files are at the top of the list.

The Unix version will be as follows:

$ ct find . -type f -exec 'cleartool desc -fmt "%Ln\t\t%En\n" "$CLEARCASE_PN"' | sort -rn | head -100

The most edited files will be at the top.

Keep in mind that raw numbers are not enough for metrics; trends matter too.
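As a quick illustration of the ranking step, the sketch below feeds a few made-up "version<TAB><TAB>path" lines (the shape produced by the -fmt string above) through the same sort and head stages. The paths, the version numbers, and the helper name rank_by_version are invented for the example. Note that the sort has to be numeric (-n); otherwise 9 would rank above 12.

```shell
#!/bin/sh
# Hypothetical helper: reads "version<TAB><TAB>path" lines on stdin and
# prints them with the highest version number first.
rank_by_version() {
    sort -rn | head -100
}

# Made-up sample standing in for the cleartool output.
printf '%s\t\t%s\n' \
    9  /vobs/test/b.txt \
    12 /vobs/test/a.txt \
    3  /vobs/test/c.txt | rank_by_version
```

With this sample, a.txt (version 12) comes first, followed by b.txt and c.txt.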

+1

VonC Oct 16 '09 at 11:51


VonC · Accepted Answer · 2009-10-16T22:53:41+0000

(The previous answer was for the simpler case: a single branch.)

Since "most of the development did not all happen on the same branch, so version numbers do not necessarily indicate the most edited files", a "way to get the number of check-ins across all branches" would be to:

- find all versions created since the start date of the last bug-fixing phase,
- sort them by file,
- then sort by number of occurrences.

Something along these lines:

C:\Prog\cc\test\test>ct find -all -type f -ver "created_since(16-Oct-2009)" -exec "cleartool descr -fmt """%En~%Sn\n""" """%CLEARCASE_XPN%"""" | grep -v "\\0" | awk -F ~ "{print $1}" | sort | uniq -c | sort /R | head -100

Or, for Unix syntax:

$ ct find -all -type f -ver 'created_since(16-Oct-2009)' -exec 'cleartool descr -fmt "%En~%Sn\n" "$CLEARCASE_XPN"' | grep -v "/0" | awk -F ~ '{print $1}' | sort | uniq -c | sort -rn | head -100

Replace the date with the one marking the start of your bug-fixing phase (for example, the date its starting label was applied).
Note again the quotes around %CLEARCASE_XPN% (triple double quotes in the Windows form), which accommodate spaces in file names.
%CLEARCASE_XPN% (or $CLEARCASE_XPN on Unix) is used here rather than %CLEARCASE_PN% because we want every version.
grep -v "/0" is there to exclude version 0 (/main/0, /main/myBranch/0, ...).
awk -F ~ "{print $1}" is used to print only the first part of each line: C:\Prog\cc\test\test\a.txt~\main\mybranch\2 becomes C:\Prog\cc\test\test\a.txt
From there, the counting and sorting can start:
- sort to make sure identical lines are grouped together.
- uniq -c to collapse duplicate lines and prefix each remaining line with its number of occurrences.
- sort -rn (or sort /R on Windows) to put the most edited files at the top.
- head -100 to keep only the 100 most edited files.
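To see the counting stages in isolation, here is a minimal sketch that runs them on a few hand-written lines in the "%En~%Sn" format instead of real cleartool output. The file names and the helper name count_edits are made up for illustration. As an assumption of this sketch, the grep pattern is anchored to the end of the line ('/0$') so that a directory whose name starts with 0 is not filtered out by accident.

```shell
#!/bin/sh
# Hypothetical helper: reads "path~version" lines on stdin and prints
# "count path" lines, most frequently edited paths first.
count_edits() {
    grep -v '/0$' |              # drop version 0 of every branch
    awk -F '~' '{print $1}' |    # keep only the element path
    sort |                       # group identical paths together
    uniq -c |                    # count occurrences of each path
    sort -rn |                   # highest counts first
    head -100                    # keep the top 100
}

# Made-up sample standing in for the cleartool output.
printf '%s\n' \
    '/vobs/test/a.txt~/main/0' \
    '/vobs/test/a.txt~/main/1' \
    '/vobs/test/a.txt~/main/mybranch/0' \
    '/vobs/test/a.txt~/main/mybranch/1' \
    '/vobs/test/a.txt~/main/mybranch/2' \
    '/vobs/test/b.txt~/main/0' \
    '/vobs/test/b.txt~/main/1' | count_edits
```

With this sample, a.txt is counted three times (one version on /main, two on mybranch) and b.txt once, so a.txt is listed first.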

Again, GnuWin32 comes in handy for the Windows version of the one-liner.
