Find the most frequently edited files in ClearCase

We are currently planning a quality improvement exercise, and I would like to target the most edited files in our ClearCase repository. Since we have just come out of a bug-fixing phase, the most frequently edited files should be a good indicator of where the code is most error-prone, and therefore most in need of quality improvement.

Does anyone know of a way to get a list of the 100 most edited files? Ideally, it would cover changes made across multiple branches.



2 answers


(My other answer covers the simpler case: a single branch.)

Since "most development does not all happen on the same branch, so the version number does not necessarily mean most edited", a "way of getting the number of check-ins across all branches" would be:

  • find all versions created since the start date of the last bug-fixing phase,
  • sort them by file,
  • then sort the files by number of occurrences.

Something along the lines of:

C:\Prog\cc\test\test>ct find -all -type f -ver "created_since(16-Oct-2009)" -exec "cleartool descr -fmt """%En~%Sn\n""" """%CLEARCASE_XPN%"""" | grep -v "\\0" | awk -F ~ "{print $1}" | sort | uniq -c | sort /R | head -100

      



Or, for Unix syntax:

$ ct find -all -type f -ver 'created_since(16-Oct-2009)' -exec 'cleartool descr -fmt "%En~%Sn\n" "$CLEARCASE_XPN"' | grep -v "/0"  | awk -F ~ '{print $1}' | sort | uniq -c | sort -rn | head -100

      

  • Replace the date with the date of the label that marks the start of your bug-fixing phase.
  • Note again the double quotes around "%CLEARCASE_XPN%", which allow for spaces in filenames.
  • %CLEARCASE_XPN% (the version-extended pathname) is used here, not %CLEARCASE_PN%, because we want every version.
  • grep -v "/0" is there to exclude version 0 (/main/0, /main/myBranch/0, ...).
  • awk -F ~ "{print $1}" prints only the first part of each line:
    C:\Prog\cc\test\test\a.txt~\main\mybranch\2
    becomes
    C:\Prog\cc\test\test\a.txt
  • From there, the counting and sorting can start:
    • sort: makes sure identical lines are grouped together.
    • uniq -c: removes duplicate lines and prefixes each remaining line with its number of occurrences.
    • sort -rn (or sort /R on Windows): puts the most edited files at the top.
    • head -100: keeps only the 100 most edited files.
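The counting half of the pipeline can be tried without a ClearCase view. Below is a minimal sketch, assuming input lines in the "%En~%Sn" format; the helper name top_edited and the sample paths are made up for illustration. It anchors the version-0 filter with $ so that only a trailing /0 is dropped, which is a slightly stricter variant of the grep used above.

```shell
#!/bin/sh
# top_edited N: reads "element~version" lines on stdin and prints the N
# elements with the most versions (hypothetical helper, N defaults to 100).
# Steps: drop version 0 of each branch, keep the element path only,
# group identical paths, count them, sort by count (highest first), truncate.
top_edited() {
    grep -v '/0$' |
        awk -F '~' '{print $1}' |
        sort |
        uniq -c |
        sort -rn |
        head -n "${1:-100}"
}

# Made-up sample standing in for the output of the find command.
sample='/vob/src/a.c~/main/0
/vob/src/a.c~/main/1
/vob/src/a.c~/main/2
/vob/src/a.c~/main/fix/0
/vob/src/a.c~/main/fix/1
/vob/src/b.c~/main/0
/vob/src/b.c~/main/1
/vob/src/c.c~/main/0'

printf '%s\n' "$sample" | top_edited 2
```

With this sample, a.c is counted three times and b.c once, while c.c disappears because only its version 0 exists.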

Again, GnuWin32 comes in handy for the Windows version of the one-liner.



(See my other answer for the more complex case: multiple branches.)

First, use a dynamic view: it is easier and faster to update its content and to tinker with its config spec rules.

If the bug fixing was done in a branch starting from a given label, set up a dynamic view with the following config spec:

element * .../MY_BRANCH/LATEST
element * MY_STARTING_LABEL
element * /main/LATEST

      

Then find all files along with their current version number (which is closely related to the number of edits):

ct find . -type f -exec "cleartool desc -fmt """%Ln\t\t%En\n""" """%CLEARCASE_PN%""""|sort /R|head -100

      



This is Windows syntax (note the triple double quotes around %CLEARCASE_PN%, which allow for spaces in filenames).
The head command comes from the GnuWin32 library.
The most edited files are at the top of the list.

The Unix version will be as follows:

$ ct find . -type f -exec 'cleartool desc -fmt "%Ln\t\t%En\n" "$CLEARCASE_PN"' | sort -rn | head -100

      

The files with the highest version numbers, i.e. the most edited ones, will be at the top.
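One thing worth checking when adapting these commands is that the sort really is numeric. A small sketch with made-up "%Ln\t\t%En" lines shows the difference: sort -rn ranks version 10 above version 9, whereas a character-based reverse sort (which is what I would expect from a plain sort /R on Windows, though I have not verified it here) ranks 9 first.

```shell
#!/bin/sh
# Made-up lines in the "%Ln\t\t%En" format: version number, two tabs, name.
versions=$(printf '%s\t\t%s\n' 9 a.c 10 b.c 2 c.c)

# Numeric reverse sort: 10 ranks above 9.
numeric_top=$(printf '%s\n' "$versions" | sort -rn | head -n 1)

# Character-based reverse sort: "9" ranks above "10".
# LC_ALL=C forces plain byte-wise comparison.
lexical_top=$(printf '%s\n' "$versions" | LC_ALL=C sort -r | head -n 1)

printf 'numeric: %s\nlexical: %s\n' "$numeric_top" "$lexical_top"
```

If the Windows sort turns out to behave this way in your setup, the sort shipped with GnuWin32 could be used with -rn instead.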

Don't forget that raw numbers are not enough for metrics; trends are important too.
