Helper script to populate reports to analyze code evolution overtime based on Codee metrics#10
Helper script to populate reports to analyze code evolution overtime based on Codee metrics#10tamara-laranga wants to merge 2 commits intomainfrom
Conversation
ulisescosti
left a comment
There was a problem hiding this comment.
Good to know that we’re finally about to have a post-processing script ready to be shared publicly!
I left some comments and suggestions in the review.
Additionally, I suggest to add a sub directory in scripts. For example:
scripts/post_processing_codee_json_reports/
e77e133 to
e3f687d
Compare
ulisescosti
left a comment
There was a problem hiding this comment.
Now I took a quick look at the Python script and left some comments and suggestions.
|
|
||
| try: | ||
| content = report_js_path.read_text(encoding="utf-8") | ||
| match = re.search(r"const\s+report\s*=\s*(\{.*\});", content, re.DOTALL) |
There was a problem hiding this comment.
I miss a comment here indicating the purpose of these lines.
| with open(file_path, "r", encoding="utf-8") as f: | ||
| return json.load(f) | ||
| except json.JSONDecodeError as e: | ||
| logger.warning(f"Malformed JSON in {file_path.name}: {e}") |
There was a problem hiding this comment.
| logger.warning(f"Malformed JSON in {file_path.name}: {e}") | |
| logger.error(f"Malformed JSON in {file_path.name}: {e}") |
If the issue clearly prevents the program from performing the given task, I think it should be an error message rather than a warning.
| logger.warning(f"Malformed JSON in {file_path.name}: {e}") | ||
| return None | ||
| except OSError as e: | ||
| logger.warning(f"Failed to read {file_path.name}: {e}") |
There was a problem hiding this comment.
| logger.warning(f"Failed to read {file_path.name}: {e}") | |
| logger.error(f"Failed to read {file_path.name}: {e}") |
| """Load a Codee HTML report by parsing report.js file.""" | ||
| report_js_path = report_dir / "report.js" | ||
| if not report_js_path.exists(): | ||
| logger.warning(f"report.js not found in {report_dir.name}") |
There was a problem hiding this comment.
| logger.warning(f"report.js not found in {report_dir.name}") | |
| logger.error(f"report.js not found in {report_dir.name}") |
| def load_html_report(report_dir: Path, logger: logging.Logger) -> dict | None: | ||
| """Load a Codee HTML report by parsing report.js file.""" | ||
| report_js_path = report_dir / "report.js" | ||
| if not report_js_path.exists(): |
There was a problem hiding this comment.
Why isn't this conditional withing the try-catch block? It might also be a separate try-catch block.
| reports.append((timestamp, report, report_dir)) | ||
|
|
||
| elif input_type == "json": | ||
| logger.info("Detected JSON format (recursive search for *.json)") |
There was a problem hiding this comment.
| logger.info("Detected JSON format (recursive search for *.json)") | |
| logger.info("Detected JSON format") |
Idem
| l_level_data[l_level].append(p_groups[l_level]) | ||
|
|
||
| # Build links to original reports (JSON or HTML) | ||
| report_links: list[dict | None] = [] |
There was a problem hiding this comment.
I think it would be valuable to leave a basic example of the data structure in comments. It would help the code to be easier to understand.
|
|
||
| def main() -> int: | ||
| parser = argparse.ArgumentParser( | ||
| description="Analyze Codee JSON or HTML reports and generate HTML visualizations." |
There was a problem hiding this comment.
| description="Analyze Codee JSON or HTML reports and generate HTML visualizations." | |
| description="Parse Codee JSON or HTML reports and generate an HTML summary visualization." |
| parser.add_argument( | ||
| "output_dir", | ||
| type=Path, | ||
| help="Directory where HTML report will be saved", |
There was a problem hiding this comment.
| help="Directory where HTML report will be saved", | |
| help="Directory where the HTML report will be saved", |
| reports = load_reports(args.input_dir, logger) | ||
|
|
||
| if not reports: | ||
| logger.warning("No valid reports loaded. Exiting.") |
There was a problem hiding this comment.
| logger.warning("No valid reports loaded. Exiting.") | |
| logger.error("No valid reports loaded. Exiting.") |
|
I've run the script with a sample input dataset, and it looks great! Regarding the UX, I would recommend including a solid example input dataset in the repository, along with the exact commands users need to follow to parse the data and generate the summary HTML report. In my opinion, these example datasets should include at least five JSON reports and five HTML reports. What do you think? |
Add a script that generates an HTML report from Codee HTML/JSON outputs. The report summarizes the evolution of Codee checkers across the codebase, enabling comparison of findings between runs and helping track improvements or regressions over time.