do not merge: added load scripts for definitions #3438
base: develop
Conversation
📝 Walkthrough

Introduces five new Django management commands for loading EMR data (Activity Definitions, Charge Item Definitions, Observation Definitions, and Specimen Definitions) from CSV files or Google Sheets, along with a shared utility module providing CSV reading, code validation, normalization, and output generation functions.

Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

Pre-merge checks and finishing touches: ❌ Failed checks (2 warnings, 1 inconclusive)
Actionable comments posted: 6
🧹 Nitpick comments (11)
care/emr/management/commands/load_specimen_definition.py (2)
111-111: Unused `facility` parameter in `process_row`.

The `facility` parameter is declared but never used in this method. Either remove it to fix the linter warning, or document why it's kept for interface consistency with other commands.

🔎 Proposed fix

```diff
-    def process_row(self, row: dict, facility: Facility) -> dict:
+    def process_row(self, row: dict) -> dict:
```

Then update the call site at line 304:

```diff
-                    data = self.process_row(row, facility)
+                    data = self.process_row(row)
```
319-320: Use `logger.exception` instead of `logger.error` for proper stack trace logging.

When catching exceptions, `logger.exception` automatically includes the full stack trace, which is valuable for debugging production issues. The current `logger.error` only logs the message.

🔎 Proposed fix

```diff
             except Exception as e:
-                logger.error("Error processing row '%s': %s", row_title, e)
+                logger.exception("Error processing row '%s'", row_title)
                 failed.append(row_title)
```

care/emr/management/commands/load_activity_definition.py (2)
106-117: Argument definition for `--validate-codes` is somewhat unconventional.

Using `action="store_true"` with `default=True` is redundant: the flag will be `True` whether or not it's passed. The `--no-validate-codes` flag correctly sets it to `False`. While this works, it might confuse future maintainers. A clearer approach would be to use a single argument that generates both forms.

🔎 Proposed simplification

```diff
         parser.add_argument(
             "--validate-codes",
-            action="store_true",
-            default=True,
-            help="Validate codes against valuesets (default: True)",
-        )
-        parser.add_argument(
-            "--no-validate-codes",
-            action="store_false",
-            dest="validate_codes",
-            help="Skip code validation",
+            action=argparse.BooleanOptionalAction,
+            default=True,
+            help="Validate codes against valuesets (default: True)",
         )
```

Or simply keep the boolean default and expose only the negative flag:

```diff
         parser.add_argument(
-            "--validate-codes",
-            action="store_true",
+            "--skip-code-validation",
+            action="store_false",
             default=True,
-            help="Validate codes against valuesets (default: True)",
-        )
-        parser.add_argument(
-            "--no-validate-codes",
-            action="store_false",
-            dest="validate_codes",
-            help="Skip code validation",
+            dest="validate_codes",
+            help="Skip code validation against valuesets",
         )
```
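For reference, `argparse.BooleanOptionalAction` (available since Python 3.9) derives the `--no-` negation from a single `add_argument` call. A minimal standalone sketch, not taken from the PR:

```python
import argparse

# One registration produces both --validate-codes and --no-validate-codes.
parser = argparse.ArgumentParser()
parser.add_argument(
    "--validate-codes",
    action=argparse.BooleanOptionalAction,
    default=True,
    help="Validate codes against valuesets",
)

print(parser.parse_args([]).validate_codes)                       # True (default)
print(parser.parse_args(["--no-validate-codes"]).validate_codes)  # False
print(parser.parse_args(["--validate-codes"]).validate_codes)     # True
```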
523-524: Use `logger.exception` for proper exception logging.

Same issue as in other commands: `logger.exception` captures the full traceback automatically.

🔎 Proposed fix

```diff
             except Exception as e:
-                logger.error("Error processing row '%s': %s", row_title, e)
+                logger.exception("Error processing row '%s'", row_title)
                 failed.append(row_title)
```

care/emr/management/commands/load_observation_definition.py (3)
130-131: Move `import json` to the top of the file.

Importing inside a method works but is unconventional and slightly less efficient (the import check runs on every call). Standard practice is to place imports at the module level.

🔎 Proposed fix

Add at the top of the file with other imports:

```diff
 import logging
+import json
 from datetime import UTC, datetime
```

Then remove line 131.
123-125: Unused `facility` parameter in `process_row`.

The `facility` parameter is declared but never referenced within the method body. If it's intentionally kept for API consistency across commands, consider adding a comment or using a `_` prefix.

🔎 Proposed fix

```diff
     def process_row(
-        self, row: dict, facility: Facility | None, validate_codes: bool
+        self, row: dict, validate_codes: bool
     ) -> dict:
```

Update call site at line 359:

```diff
-                        data = self.process_row(row, facility, validate_codes)
+                        data = self.process_row(row, validate_codes)
```
375-376: Use `logger.exception` for proper stack trace logging.

Consistent with the other commands, this should use `logger.exception` to capture full tracebacks.

🔎 Proposed fix

```diff
             except Exception as e:
-                logger.error("Error processing row '%s': %s", row_title, e)
+                logger.exception("Error processing row '%s'", row_title)
                 failed.append(row_title)
```

care/emr/management/commands/load_emr_utils.py (2)
113-128: Consider adding basic retry logic for HTTP requests.

Network requests to Google Sheets or external URLs may experience transient failures. A simple retry with backoff would improve reliability for batch operations.

🔎 Example with tenacity or manual retry

```python
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def read_csv_from_url(url: str) -> list[dict[str, str]]:
    """Read CSV from a URL with retry support."""
    logger.info("Reading CSV from URL: %s", url)
    session = requests.Session()
    retries = Retry(total=3, backoff_factor=0.5, status_forcelist=[500, 502, 503, 504])
    session.mount("https://", HTTPAdapter(max_retries=retries))
    response = session.get(url, timeout=30)
    response.raise_for_status()
    # ... rest of the function
```
204-206: Use `logger.exception` for proper stack trace capture.

When catching exceptions, `logger.exception` provides valuable debugging information by including the full traceback.

🔎 Proposed fix

```diff
     except Exception as e:
-        logger.error("Error validating code %s: %s", code, e)
+        logger.exception("Error validating code %s", code)
         return False
```

care/emr/management/commands/load_charge_item_definition.py (2)
189-196: Silent fallback to 0.0 for invalid base prices may mask data issues.

When `Base Price` cannot be parsed, the code silently defaults to `0.0`. This could hide data quality problems in the source CSV. Consider logging a warning so operators are aware of unparseable values.

🔎 Proposed fix

```diff
         try:
             base_price = float(base_price_str)
         except (ValueError, TypeError):
+            logger.warning(
+                "Invalid base price '%s' for '%s', defaulting to 0.0",
+                row.get("Base Price"),
+                row.get("title"),
+            )
             base_price = 0.0
```
341-342: Use `logger.exception` for proper stack trace logging.

Consistent with the pattern across all commands, this should capture full tracebacks.

🔎 Proposed fix

```diff
             except Exception as e:
-                logger.error("Error processing row '%s': %s", row_title, e)
+                logger.exception("Error processing row '%s'", row_title)
                 failed.append(row_title)
```
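The difference between the two logging calls is easy to demonstrate in isolation; in this self-contained sketch the row title is made up:

```python
import io
import logging

# Route log output into a string so it can be inspected.
stream = io.StringIO()
logging.basicConfig(stream=stream, level=logging.INFO, force=True)
logger = logging.getLogger("demo")

try:
    int("not-a-number")
except ValueError:
    # logger.exception logs at ERROR level and appends the active traceback;
    # logger.error with the same arguments would log only the message.
    logger.exception("Error processing row '%s'", "Complete Blood Count")

output = stream.getvalue()
print("Traceback (most recent call last):" in output)  # True
print("ValueError" in output)                          # True
```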
📜 Review details

Configuration used: Repository UI
Review profile: CHILL
Plan: Pro

📒 Files selected for processing (5)

- care/emr/management/commands/load_activity_definition.py
- care/emr/management/commands/load_charge_item_definition.py
- care/emr/management/commands/load_emr_utils.py
- care/emr/management/commands/load_observation_definition.py
- care/emr/management/commands/load_specimen_definition.py
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (.cursorrules)
**/*.py: Prioritize readability and maintainability; follow Django's coding style guide (PEP 8 compliance).
Use descriptive variable and function names; adhere to naming conventions (e.g., lowercase with underscores for functions and variables).
Files:

- care/emr/management/commands/load_activity_definition.py
- care/emr/management/commands/load_charge_item_definition.py
- care/emr/management/commands/load_observation_definition.py
- care/emr/management/commands/load_specimen_definition.py
- care/emr/management/commands/load_emr_utils.py
**/{models,views,management/commands}/*.py
📄 CodeRabbit inference engine (.cursorrules)
Leverage Django’s ORM for database interactions; avoid raw SQL queries unless necessary for performance.
Files:

- care/emr/management/commands/load_activity_definition.py
- care/emr/management/commands/load_charge_item_definition.py
- care/emr/management/commands/load_observation_definition.py
- care/emr/management/commands/load_specimen_definition.py
- care/emr/management/commands/load_emr_utils.py
🧬 Code graph analysis (2)

care/emr/management/commands/load_activity_definition.py (9)

- care/emr/management/commands/load_emr_utils.py (9): create_slug (83-101), ensure_category (241-281), normalize_title (17-80), parse_code (137-160), read_csv_from_file (104-110), read_csv_from_google_sheet (131-134), read_csv_from_url (113-128), validate_and_substitute_code (222-238), write_output_csv (163-185)
- care/emr/models/activity_definition.py (1): ActivityDefinition (7-39)
- care/emr/models/charge_item_definition.py (1): ChargeItemDefinition (6-24)
- care/emr/models/location.py (1): FacilityLocation (12-127)
- care/emr/models/observation_definition.py (1): ObservationDefinition (6-27)
- care/emr/models/specimen_definition.py (1): SpecimenDefinition (6-23)
- care/emr/management/commands/load_charge_item_definition.py (1): handle (280-374)
- care/emr/management/commands/load_observation_definition.py (1): handle (308-409)
- care/emr/management/commands/load_specimen_definition.py (1): handle (258-352)

care/emr/management/commands/load_emr_utils.py (4)

- care/emr/registries/care_valueset/care_valueset.py (1): valueset (45-46)
- care/emr/resources/common/coding.py (1): Coding (4-13)
- care/emr/models/resource_category.py (1): ResourceCategory (10-86)
- care/emr/models/base.py (1): calculate_slug_from_facility (39-40)
🪛 GitHub Actions: Lint Code Base
care/emr/management/commands/load_activity_definition.py
[error] 436-436: PLR0915 Too many statements (59 > 50)
care/emr/management/commands/load_observation_definition.py
[error] 123-123: PLR0912 Too many branches (15 > 12)
[error] 123-123: PLR0915 Too many statements (53 > 50)
care/emr/management/commands/load_specimen_definition.py
[error] 111-111: PLR0912 Too many branches (12 > 12)
[error] 123-123: PLR0915 Too many statements (54 > 50)
care/emr/management/commands/load_emr_utils.py
[error] 97-97: PLR2004 Magic value used in comparison, consider replacing 25 with a constant variable
🪛 Ruff (0.14.10)
care/emr/management/commands/load_activity_definition.py
149-149: Avoid specifying long messages outside the exception class
(TRY003)
223-223: Abstract raise to an inner function
(TRY301)
223-223: Avoid specifying long messages outside the exception class
(TRY003)
428-428: Consider moving this statement to an else block
(TRY300)
436-436: Unused method argument: args
(ARG002)
523-523: Do not catch blind exception: Exception
(BLE001)
524-524: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
care/emr/management/commands/load_charge_item_definition.py
95-95: Avoid specifying long messages outside the exception class
(TRY003)
187-187: Abstract raise to an inner function
(TRY301)
187-187: Avoid specifying long messages outside the exception class
(TRY003)
272-272: Consider moving this statement to an else block
(TRY300)
280-280: Unused method argument: args
(ARG002)
341-341: Do not catch blind exception: Exception
(BLE001)
342-342: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
care/emr/management/commands/load_observation_definition.py
121-121: Avoid specifying long messages outside the exception class
(TRY003)
124-124: Unused method argument: facility
(ARG002)
134-134: Abstract raise to an inner function
(TRY301)
134-134: Avoid specifying long messages outside the exception class
(TRY003)
300-300: Consider moving this statement to an else block
(TRY300)
308-308: Unused method argument: args
(ARG002)
375-375: Do not catch blind exception: Exception
(BLE001)
376-376: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
care/emr/management/commands/load_specimen_definition.py
109-109: Avoid specifying long messages outside the exception class
(TRY003)
111-111: Unused method argument: facility
(ARG002)
118-118: Abstract raise to an inner function
(TRY301)
118-118: Avoid specifying long messages outside the exception class
(TRY003)
126-126: Abstract raise to an inner function
(TRY301)
126-126: Avoid specifying long messages outside the exception class
(TRY003)
250-250: Consider moving this statement to an else block
(TRY300)
258-258: Unused method argument: args
(ARG002)
319-319: Do not catch blind exception: Exception
(BLE001)
320-320: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
care/emr/management/commands/load_emr_utils.py
204-204: Do not catch blind exception: Exception
(BLE001)
205-205: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
277-277: Consider moving this statement to an else block
(TRY300)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Test / Test
🔇 Additional comments (4)
care/emr/management/commands/load_activity_definition.py (1)
151-188: Dependency loading silently proceeds even if called commands fail.

If any of the `call_command` invocations raise an exception (e.g., due to invalid CSV data), the exception will propagate and halt the main command. However, if the dependency commands log errors but return normally (with failures), the main command proceeds unaware that some dependencies may be missing. This could lead to confusing cascading "missing dependency" errors later. Consider capturing and surfacing dependency loading results, or at least documenting this behavior.
care/emr/management/commands/load_emr_utils.py (1)

17-80: LGTM on the `normalize_title` implementation.

The title normalization logic is thorough, handling spacing and punctuation while preserving medical abbreviations. The set of uppercase words covers common medical terms nicely.
care/emr/management/commands/load_charge_item_definition.py (2)

97-178: Tax components structure looks well-designed for Indian GST.

The CGST/SGST split at half the total rate (2.5%/2.5% for 5%, 6%/6% for 12%, 9%/9% for 18%) correctly reflects Indian GST regulations for intra-state supplies. The warning for unknown tax rates is helpful.
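The split described above is an even halving of the total rate. A sketch of that arithmetic (the function name is illustrative, not taken from the PR):

```python
def gst_components(total_rate: float) -> dict[str, float]:
    """Split a total GST rate evenly into CGST and SGST for intra-state supply."""
    half = total_rate / 2
    return {"cgst": half, "sgst": half}


for rate in (5.0, 12.0, 18.0):
    print(rate, gst_components(rate))
# 5.0 {'cgst': 2.5, 'sgst': 2.5}
# 12.0 {'cgst': 6.0, 'sgst': 6.0}
# 18.0 {'cgst': 9.0, 'sgst': 9.0}
```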
1-374: Overall: Well-structured command that follows established patterns.

The command follows the same consistent patterns as the other loaders, making the codebase predictable and maintainable. The tax handling logic is thoughtful and the batch processing implementation is solid.
```python
    def handle(self, *args, **options):
        start_time = datetime.now(tz=UTC)

        # Set logging level
        if options["verbosity"] == 0:
            logger.setLevel(logging.ERROR)
        elif options["verbosity"] == 1:
            logger.setLevel(logging.INFO)
        else:
            logger.setLevel(logging.DEBUG)

        try:
            # Get facility
            facility = Facility.objects.get(external_id=options["facility"])
            logger.info("Loading activities for facility: %s", facility.name)

            # Load dependencies first
            self.load_dependencies(facility, options)

            # Load activity data
            logger.info("\n=== Loading Activity Definitions ===")
            rows = self.load_data(options)
            logger.info("Loaded %d rows from source", len(rows))

            if not rows:
                self.stdout.write(self.style.WARNING("No rows found in source"))
                return

            # Process rows in batches
            batch_size = options["batch_size"]
            validate_codes = options["validate_codes"]
            total_rows = len(rows)
            successful = []
            failed = []
            output_rows = []

            for i in range(0, total_rows, batch_size):
                batch = rows[i : i + batch_size]
                batch_num = (i // batch_size) + 1
                total_batches = (total_rows + batch_size - 1) // batch_size

                logger.info(
                    "Processing batch %d/%d (rows %d-%d)",
                    batch_num,
                    total_batches,
                    i + 1,
                    min(i + batch_size, total_rows),
                )

                for row in batch:
                    row_title = row.get("title", "Unknown")
                    slug_value = ""

                    try:
                        data = self.process_row(row, facility, validate_codes, None)
                        slug_value = data["slug_value"]

                        data, missing = self.resolve_dependencies(data, facility)

                        if missing:
                            error_msg = f"Missing dependencies: {', '.join(missing)}"
                            logger.warning("Skipping %s: %s", data["title"], error_msg)
                            failed.append(slug_value)
                            output_rows.append(
                                {
                                    "title": data["title"],
                                    "slug_value": slug_value,
                                    "status": "Failed",
                                    "error": error_msg,
                                    "code_substitutions": data.get("substitutions", ""),
                                }
                            )
                            continue

                        self.create_activity_definition(data, facility, None)

                        successful.append(slug_value)
                        output_rows.append(
                            {
                                "title": data["title"],
                                "slug_value": slug_value,
                                "status": "Success",
                                "error": "",
                                "code_substitutions": data.get("substitutions", ""),
                            }
                        )

                    except Exception as e:
                        logger.error("Error processing row '%s': %s", row_title, e)
                        failed.append(row_title)
                        output_rows.append(
                            {
                                "title": row_title,
                                "slug_value": slug_value,
                                "status": "Failed",
                                "error": str(e),
                                "code_substitutions": "",
                            }
                        )

            # Write output CSV
            output_path = options.get("output") or default_output_path
            if output_path:
                write_output_csv(
                    output_path,
                    output_rows,
                    ["title", "slug_value", "status", "error", "code_substitutions"],
                )

            # Print summary
            self.stdout.write("\n=== Summary ===")
            self.stdout.write(f"Total rows: {total_rows}")
            self.stdout.write(self.style.SUCCESS(f"Successful: {len(successful)}"))
            self.stdout.write(self.style.ERROR(f"Failed: {len(failed)}"))
            self.stdout.write(f"Time taken: {datetime.now(tz=UTC) - start_time}")
            self.stdout.write(
                self.style.SUCCESS("Activity definitions loaded successfully")
            )

        except Exception as e:
            logger.exception("Error in main process")
            error_message = f"Error in main process: {e}"
            self.stdout.write(self.style.ERROR(error_message))
            raise
```
Pipeline failure: handle method exceeds statement limit (59 > 50).
This mirrors the pattern in the other loader commands. Consider extracting batch processing, dependency loading orchestration, and summary printing into helper methods. This will make the code more maintainable and resolve the linter error.
🧰 Tools
🪛 GitHub Actions: Lint Code Base
[error] 436-436: PLR0915 Too many statements (59 > 50)
🪛 Ruff (0.14.10)
436-436: Unused method argument: args
(ARG002)
523-523: Do not catch blind exception: Exception
(BLE001)
524-524: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
🤖 Prompt for AI Agents
In care/emr/management/commands/load_activity_definition.py around lines
436-559, the handle method is too large (59 statements) and should be split to
satisfy the linter: extract (1) dependency/load orchestration into a helper like
load_all_dependencies(self, facility, options), (2) batch processing of rows
into process_batches(self, rows, facility, validate_codes, batch_size) which
returns (successful, failed, output_rows), and (3) summary/output writing into
write_summary_and_output(self, start_time, total_rows, successful, failed,
output_rows, options). Replace the inlined blocks with calls to these helpers,
ensure params passed (options, facility, rows, validate_codes, batch_size,
start_time), preserve logging and exception handling in handle (catch top-level
exceptions and re-raise) and keep helper methods unit-testable and small to
reduce statements in handle.
```python
def create_slug(name: str) -> str:
    """
    Create a slug from a name.
    Matches the TypeScript createSlug function.
    """
    if not name:
        return ""

    slug = name.lower()
    slug = re.sub(r"[^a-z0-9\s_-]", "", slug)
    slug = re.sub(r"\s+", "-", slug)
    slug = re.sub(r"-+", "-", slug)
    slug = slug.strip()
    slug = slug[:25]
    if len(slug) < 25:
        hash_suffix = hashlib.sha256(slug.encode()).hexdigest()
        needed_hash = 25 - len(slug) - 1
        slug = slug + "-" + hash_suffix[:needed_hash]
    return slug
```
Extract magic number 25 to a named constant.
The pipeline reports PLR2004 for the magic value. While the comment explains this matches a TypeScript function, a named constant would improve clarity and satisfy the linter.
🔎 Proposed fix

```diff
+SLUG_MAX_LENGTH = 25
+
+
 def create_slug(name: str) -> str:
     """
     Create a slug from a name.
     Matches the TypeScript createSlug function.
     """
     if not name:
         return ""
     slug = name.lower()
     slug = re.sub(r"[^a-z0-9\s_-]", "", slug)
     slug = re.sub(r"\s+", "-", slug)
     slug = re.sub(r"-+", "-", slug)
     slug = slug.strip()
-    slug = slug[:25]
-    if len(slug) < 25:
+    slug = slug[:SLUG_MAX_LENGTH]
+    if len(slug) < SLUG_MAX_LENGTH:
         hash_suffix = hashlib.sha256(slug.encode()).hexdigest()
-        needed_hash = 25 - len(slug) - 1
+        needed_hash = SLUG_MAX_LENGTH - len(slug) - 1
         slug = slug + "-" + hash_suffix[:needed_hash]
     return slug
```

🧰 Tools
🪛 GitHub Actions: Lint Code Base
[error] 97-97: PLR2004 Magic value used in comparison, consider replacing 25 with a constant variable
🤖 Prompt for AI Agents
In care/emr/management/commands/load_emr_utils.py around lines 83 to 101, the
repeated literal 25 used to limit slug length should be extracted to a
module-level named constant (e.g., MAX_SLUG_LENGTH = 25) to remove the magic
number; define the constant near the top of the file, replace all occurrences of
25 in this function with that constant (including the slice, the length
comparison, and the hash suffix calculation), and ensure the needed_hash
calculation uses MAX_SLUG_LENGTH so behavior remains identical.
```python
def ensure_category(category_name: str, facility, resource_type: str, created_by=None):
    """
    Ensure a ResourceCategory exists for the given name, if not create it.
    Returns the ResourceCategory object.
    Raises exceptions on database or validation errors.
    """
    from care.emr.models.resource_category import ResourceCategory

    try:
        category_title = normalize_title(category_name)
        category_slug_value = create_slug(category_name)
        category_slug = ResourceCategory.calculate_slug_from_facility(
            str(facility.external_id), category_slug_value
        )

        # Check if exists
        category = ResourceCategory.objects.filter(
            slug=category_slug, facility=facility
        ).first()

        if category:
            return category

        # Create new category
        category = ResourceCategory(
            facility=facility,
            resource_type=resource_type,
            resource_sub_type="other",
            title=category_title,
            slug=category_slug,
            description=f"Auto-generated category for {category_title}",
            created_by=created_by,
            updated_by=created_by,
        )
        category.save()
        logger.info("Created category: %s", category_title)
        return category

    except Exception as e:
        error_message = f"Failed to ensure category '{category_name}': {e}"
        raise RuntimeError(error_message) from e
```
🧩 Analysis chain
🏁 Script executed:
```shell
rg -n "resource_type" care/emr/models/resource_category.py -A 2 -B 2
```

Repository: ohcnetwork/care
Length of output: 268

🏁 Script executed:

```shell
cat care/emr/models/resource_category.py
```

Repository: ohcnetwork/care
Length of output: 3025

🏁 Script executed:

```shell
# Search for resource_type constant definitions or usage patterns
rg -n "resource_type\s*=" care/emr --type py | head -20
```

Repository: ohcnetwork/care
Length of output: 2198

🏁 Script executed:

```shell
# Also check for any choices or constants defined for resource types
rg -n "RESOURCE_TYPE|resource_type.*choice" care/emr --type py -i
```

Repository: ohcnetwork/care
Length of output: 28120

🏁 Script executed:

```shell
rg -n "ResourceCategoryResourceTypeOptions" care/emr --type py
```

Repository: ohcnetwork/care
Length of output: 189

🏁 Script executed:

```shell
cat care/emr/resources/resource_category/spec.py
```

Repository: ohcnetwork/care
Length of output: 1829
Validate `resource_type` against the `ResourceCategoryResourceTypeOptions` enum.

The function accepts any `resource_type` string, but `ResourceCategoryResourceTypeOptions` defines only three valid values: `product_knowledge`, `activity_definition`, and `charge_item_definition`. Validating the input against this enum would catch typos or misconfiguration early.
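A sketch of the suggested check; the enum below is a hypothetical stand-in for the real `ResourceCategoryResourceTypeOptions`, mirroring only the three values the review found:

```python
from enum import Enum


class ResourceCategoryResourceTypeOptions(str, Enum):
    # Illustrative stand-in, not the actual model definition.
    product_knowledge = "product_knowledge"
    activity_definition = "activity_definition"
    charge_item_definition = "charge_item_definition"


def validate_resource_type(resource_type: str) -> str:
    """Fail fast with a clear message when resource_type is not a known option."""
    valid = {option.value for option in ResourceCategoryResourceTypeOptions}
    if resource_type not in valid:
        msg = f"Invalid resource_type {resource_type!r}; expected one of {sorted(valid)}"
        raise ValueError(msg)
    return resource_type


print(validate_resource_type("activity_definition"))  # activity_definition
```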
🧰 Tools
🪛 Ruff (0.14.10)
277-277: Consider moving this statement to an else block
(TRY300)
🤖 Prompt for AI Agents
In care/emr/management/commands/load_emr_utils.py around lines 241 to 281, the
function ensure_category accepts any resource_type string but must validate it
against ResourceCategoryResourceTypeOptions; import
ResourceCategoryResourceTypeOptions from care.emr.models.resource_category (or
its module), perform a validation check early (before creating or querying
ResourceCategory) to ensure resource_type is one of the enum's allowed values,
and if not raise a clear ValueError (or RuntimeError consistent with surrounding
code) with a message listing valid options; keep the rest of the flow unchanged
so validated resource_type is used for the query/creation.
```python
    def process_row(
        self, row: dict, facility: Facility | None, validate_codes: bool
    ) -> dict:
        """
        Process a single CSV row into an ObservationDefinition data dict.
        Raises exceptions with descriptive messages on errors.
        """
        try:
            import json

            if not row.get("title"):
                raise ValueError("Missing required field: title")

            code_value = row.get("code_value")
            code_system = row.get("code_system", "http://loinc.org")
            code_display = row.get("code_display")

            # Default code for observation
            default_code = {
                "code": "104922-0",
                "system": "http://loinc.org",
                "display": "Laboratory test details panel",
            }

            substitution_messages = []

            # Validate code if requested
            if validate_codes and code_value:
                code, sub_msg = validate_and_substitute_code(
                    code_value,
                    code_system,
                    "system-observation",
                    default_code,
                )
                if sub_msg:
                    substitution_messages.append(f"code: {sub_msg}")
            else:
                code = parse_code(code_value, code_system, code_display) or default_code

            body_site = parse_code(
                row.get("body_site_code"),
                row.get("body_site_system", "http://snomed.info/sct"),
                row.get("body_site_display"),
            )

            # Parse optional method
            method_code = row.get("method_code")
            method_system = row.get(
                "method_system",
                "http://terminology.hl7.org/CodeSystem/observation-methods",
            )
            method_display = row.get("method_display")

            default_method = {
                "code": "386053000",
                "system": "http://snomed.info/sct",
                "display": "Technique",
            }

            if validate_codes and method_code:
                method, sub_msg = validate_and_substitute_code(
                    method_code,
                    method_system,
                    "system-collection-method",
                    default_method,
                )
                if sub_msg:
                    substitution_messages.append(f"method: {sub_msg}")
            else:
                method = (
                    parse_code(method_code, method_system, method_display)
                    if method_code
                    else None
                )

            permitted_unit = parse_code(
                row.get("permitted_unit_code"),
                row.get("permitted_unit_system", "http://unitsofmeasure.org"),
                row.get("permitted_unit_display"),
            )

            component = []
            if row.get("component"):
                try:
                    component = json.loads(row["component"])
                    if not isinstance(component, list):
                        component = []
                except (json.JSONDecodeError, TypeError):
                    logger.warning("Invalid component JSON for %s", row.get("title"))

            qualified_ranges = []
            if row.get("qualified_ranges"):
                try:
                    qualified_ranges = json.loads(row["qualified_ranges"])
                    if not isinstance(qualified_ranges, list):
                        qualified_ranges = []
                except (json.JSONDecodeError, TypeError):
                    logger.warning(
                        "Invalid qualified_ranges JSON for %s", row.get("title")
                    )

            title = normalize_title(row["title"])
            slug_value = create_slug(title)

            return {
                "title": title,
                "slug_value": slug_value,
                "status": row.get("status", "active"),
                "description": row.get("description", ""),
                "category": row.get("category", "laboratory"),
                "code": code,
                "permitted_data_type": row.get("permitted_data_type", "string"),
                "body_site": body_site,
                "method": method,
                "permitted_unit": permitted_unit,
                "derived_from_uri": row.get("derived_from_uri", ""),
                "component": component,
                "qualified_ranges": qualified_ranges,
                "substitutions": "; ".join(substitution_messages)
                if substitution_messages
                else "",
            }

        except (KeyError, ValueError) as e:
            error_message = f"Failed to process row: {e}"
            raise ValueError(error_message) from e
        except Exception as e:
            error_message = f"Unexpected error processing row: {e}"
            raise RuntimeError(error_message) from e
```
Pipeline failure: process_row exceeds complexity limits.
The linter reports PLR0912 (too many branches: 15 > 12) and PLR0915 (too many statements: 53 > 50). Consider extracting JSON parsing, code validation, and data assembly into separate helper methods.
🔎 Suggested extraction points
```python
def _parse_json_field(self, row: dict, field_name: str, title: str) -> list:
    """Parse optional JSON field, returning empty list on failure."""
    if not row.get(field_name):
        return []
    try:
        parsed = json.loads(row[field_name])
        return parsed if isinstance(parsed, list) else []
    except (json.JSONDecodeError, TypeError):
        logger.warning("Invalid %s JSON for %s", field_name, title)
        return []


def _validate_observation_code(
    self,
    code_value: str | None,
    code_system: str,
    code_display: str | None,
    validate_codes: bool,
    default_code: dict,
) -> tuple[dict, list[str]]:
    """Validate and return code with any substitution messages."""
    # extraction of lines 149-160
    ...
```

🧰 Tools
🪛 GitHub Actions: Lint Code Base
[error] 123-123: PLR0912 Too many branches (15 > 12)
[error] 123-123: PLR0915 Too many statements (53 > 50)
🪛 Ruff (0.14.10)
124-124: Unused method argument: facility
(ARG002)
134-134: Abstract raise to an inner function
(TRY301)
134-134: Avoid specifying long messages outside the exception class
(TRY003)
🤖 Prompt for AI Agents
In care/emr/management/commands/load_observation_definition.py around lines 123
to 252, the process_row method is over the PLR complexity/statement limits;
extract the JSON parsing blocks, the code/method validation logic, and the final
data-assembly into small helper methods to reduce branches and statements.
Implement a _parse_json_field(self, row, field_name, title) that returns a list
(empty on missing/invalid JSON) and logs the same warnings, and a
_validate_observation_code(self, code_value, code_system, code_display,
validate_codes, default_code) that encapsulates the validate_and_substitute_code
vs parse_code logic and returns (code_dict, substitution_message_or_empty). Then
replace the inline blocks for component, qualified_ranges, code, and method with
calls to those helpers and build the returned dict using a short, single
assembly block; keep exception handling but remove duplicated logic so
process_row stays under the branch/statement thresholds.
```python
type_tested = {
    "is_derived": bool(row.get("is_derived", False)),
    "preference": row.get("preference", "preferred"),
    "retention_time": retention_time,
    "single_use": bool(row.get("single_use", True)),
}
```
Boolean coercion from CSV strings will not behave as expected.
CSV values are strings, so bool(row.get("is_derived", False)) will return True for any non-empty string including "False", "false", or "0". This could lead to incorrect data being stored.
🔎 Proposed fix
```diff
+def parse_bool(value, default: bool = False) -> bool:
+    """Parse boolean from CSV string."""
+    if isinstance(value, bool):
+        return value
+    if not value:
+        return default
+    return str(value).lower() in ("true", "1", "yes")
+
 type_tested = {
-    "is_derived": bool(row.get("is_derived", False)),
+    "is_derived": parse_bool(row.get("is_derived"), False),
     "preference": row.get("preference", "preferred"),
     "retention_time": retention_time,
-    "single_use": bool(row.get("single_use", True)),
+    "single_use": parse_bool(row.get("single_use"), True),
 }
```
}🤖 Prompt for AI Agents
In care/emr/management/commands/load_specimen_definition.py around lines 185 to
190, the code uses bool(row.get(...)) which treats any non-empty CSV string
(e.g. "False") as True; change the boolean coercion to parse the CSV string
explicitly by reading the raw value, handling None, trimming and lowercasing it,
and then comparing against an explicit truth set (e.g.
{"true","1","yes","y","t"}) for both is_derived and single_use so
"false"/"0"/empty string become False while valid truthy strings become True.
```python
def handle(self, *args, **options):
    start_time = datetime.now(tz=UTC)

    # Set logging level
    if options["verbosity"] == 0:
        logger.setLevel(logging.ERROR)
    elif options["verbosity"] == 1:
        logger.setLevel(logging.INFO)
    else:
        logger.setLevel(logging.DEBUG)

    try:
        facility = Facility.objects.get(external_id=options["facility"])
        logger.info("Loading specimens for facility: %s", facility.name)

        rows = self.load_data(options)
        logger.info("Loaded %d rows from source", len(rows))

        if not rows:
            self.stdout.write(self.style.WARNING("No rows found in source"))
            return

        batch_size = options["batch_size"]
        total_rows = len(rows)
        successful = []
        failed = []
        output_rows = []

        for i in range(0, total_rows, batch_size):
            batch = rows[i : i + batch_size]
            batch_num = (i // batch_size) + 1
            total_batches = (total_rows + batch_size - 1) // batch_size

            logger.info(
                "Processing batch %d/%d (rows %d-%d)",
                batch_num,
                total_batches,
                i + 1,
                min(i + batch_size, total_rows),
            )

            for row in batch:
                row_title = row.get("title", "Unknown")
                slug_value = ""

                try:
                    data = self.process_row(row, facility)
                    slug_value = data["slug_value"]

                    self.create_specimen_definition(data, facility, None)

                    successful.append(slug_value)
                    output_rows.append(
                        {
                            "title": data["title"],
                            "slug_value": slug_value,
                            "status": "Success",
                            "error": "",
                        }
                    )

                except Exception as e:
                    logger.error("Error processing row '%s': %s", row_title, e)
                    failed.append(row_title)
                    output_rows.append(
                        {
                            "title": row_title,
                            "slug_value": slug_value,
                            "status": "Failed",
                            "error": str(e),
                        }
                    )

        output_path = options.get("output") or default_output_path
        if output_path:
            write_output_csv(
                output_path,
                output_rows,
                ["title", "slug_value", "status", "error"],
            )

        self.stdout.write("\n=== Summary ===")
        self.stdout.write(f"Total rows: {total_rows}")
        self.stdout.write(self.style.SUCCESS(f"Successful: {len(successful)}"))
        self.stdout.write(self.style.ERROR(f"Failed: {len(failed)}"))
        self.stdout.write(f"Time taken: {datetime.now(tz=UTC) - start_time}")
        self.stdout.write(
            self.style.SUCCESS("Specimen definitions loaded successfully")
        )

    except Exception as e:
        logger.exception("Error in main process")
        error_message = f"Error in main process: {e}"
        self.stdout.write(self.style.ERROR(error_message))
```
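The batching arithmetic in the loop above — ceiling division for `total_batches`, slicing for each batch's 1-based row range — can be checked in isolation:

```python
def batches(total_rows: int, batch_size: int):
    """Yield (batch_num, total_batches, first_row, last_row) as the loop computes them."""
    total_batches = (total_rows + batch_size - 1) // batch_size  # ceiling division
    for i in range(0, total_rows, batch_size):
        batch_num = (i // batch_size) + 1
        yield batch_num, total_batches, i + 1, min(i + batch_size, total_rows)


for info in batches(7, 3):
    print(info)
# (1, 3, 1, 3)
# (2, 3, 4, 6)
# (3, 3, 7, 7)
```

The final batch is correctly short (rows 7-7), and an empty input yields no batches at all.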
Pipeline failure: handle method exceeds complexity limits.
The linter reports PLR0912 (too many branches) and PLR0915 (too many statements). Consider extracting the batch processing loop and summary output into separate helper methods to reduce complexity and improve readability.
🔎 Suggested refactor approach
```python
def _process_batch(self, batch: list[dict], facility: Facility) -> tuple[list, list, list]:
    """Process a batch of rows. Returns (successful, failed, output_rows)."""
    successful = []
    failed = []
    output_rows = []
    for row in batch:
        row_title = row.get("title", "Unknown")
        slug_value = ""
        try:
            data = self.process_row(row)
            slug_value = data["slug_value"]
            self.create_specimen_definition(data, facility, None)
            successful.append(slug_value)
            output_rows.append({
                "title": data["title"],
                "slug_value": slug_value,
                "status": "Success",
                "error": "",
            })
        except Exception as e:
            logger.exception("Error processing row '%s'", row_title)
            failed.append(row_title)
            output_rows.append({
                "title": row_title,
                "slug_value": slug_value,
                "status": "Failed",
                "error": str(e),
            })
    return successful, failed, output_rows


def _print_summary(self, total_rows: int, successful: list, failed: list, start_time):
    """Print final summary."""
    self.stdout.write("\n=== Summary ===")
    self.stdout.write(f"Total rows: {total_rows}")
    self.stdout.write(self.style.SUCCESS(f"Successful: {len(successful)}"))
    self.stdout.write(self.style.ERROR(f"Failed: {len(failed)}"))
    self.stdout.write(f"Time taken: {datetime.now(tz=UTC) - start_time}")
    self.stdout.write(self.style.SUCCESS("Specimen definitions loaded successfully"))
```
🪛 Ruff (0.14.10)
258-258: Unused method argument: args
(ARG002)
319-319: Do not catch blind exception: Exception
(BLE001)
320-320: Use logging.exception instead of logging.error
Replace with exception
(TRY400)
🤖 Prompt for AI Agents
In care/emr/management/commands/load_specimen_definition.py around lines
258-351, the handle method is too large/complex (PLR0912/PLR0915); extract the
inner batch loop into a helper like _process_batch(self, batch: list[dict],
facility: Facility) -> tuple[list, list, list] that processes a single batch,
returns (successful, failed, output_rows) and uses logger.exception for row
errors, and extract the final printing into _print_summary(self, total_rows:
int, successful: list, failed: list, start_time); update handle to iterate
batches, call _process_batch for each batch and aggregate returned lists, then
call _print_summary at the end and keep CSV output logic unchanged.
Proposed Changes
Converted the FE scripts to BE management commands for loading activity definitions, observation definitions, specimen definitions, and charge item definitions.
Merge Checklist
/docs
Only PRs with test cases included and passing lint and test pipelines will be reviewed
@ohcnetwork/care-backend-maintainers @ohcnetwork/care-backend-admins
Summary by CodeRabbit