Multi-carrier daily report automation · Python · POP3 → XLSX → SMTP
// system_metrics
// dev_timeline
// data_sources → output
// technical_highlights
sys.argv. Threshold shifts per run (≥1700 / ≥1800), no external state._HHMM suffix embedded at download time. Re-runs check the filename timestamp. No lock files, no DB.try/except. Missing files are skipped; RuntimeError only when dfs is empty. Alert and report are independent..xlsb export, preserving all formatting.// dependencies
// execution_flow
click to zoom in
// error_handling
DownloadResult(success=False). Pipeline continues with whatever files were already downloaded.FileNotFoundError → skip & continue. Only raises RuntimeError if all sources are missing.False. No crash.Date header → silently skipped. Scan continues to the next message.finally.success=False with partial received_times. Failure alert lists what arrived and what didn't.// code_highlights · main.example.py
# Split one address string into 5 structured fields # and splice them back at the original column position d_col = col_dict["D"] split_df = ( df[d_col].str.split(",", expand=True) .apply(lambda x: x.str.strip()) ) split_df.columns = [ "ConsigneeAddress", "City", "State", "Country", "Zip" ] d_idx = list(df.columns).index(d_col) df.drop(columns=[d_col], inplace=True) for i, col in enumerate(split_df.columns): df.insert(d_idx + i, col, split_df[col]) # Null-fill keeps pivot clean — no (Blank) label rows df[col_dict["K"]] = df[col_dict["K"]].fillna( "SHIPMENT EN-ROUTE-TO-DEST" )
# Each source is isolated — one failure never blocks others dfs = [] for label, loader in [ ("Provider A", lambda: pd.read_excel( find_file("Provider A*.xlsx"))), ("Provider B", lambda: pd.read_excel( find_file("Provider B*.xlsx")).iloc[:, 0:19]), ("Provider C", provider_c_report), ("Provider D", provider_d_report), ("Provider E", lambda: pd.read_excel( find_file("Provider E*.xlsx")).iloc[:, 0:19]), ]: try: df = loader() df.columns = template_cols[:len(df.columns)] dfs.append(df) except FileNotFoundError: print(f"[SKIP] {label} not found") if not dfs: raise RuntimeError("No files available") df_result = pd.concat(dfs, ignore_index=True)
# col_dict rebuilt after every mutation — # letter keys stay stable regardless of insert/drop order col_dict = { get_column_letter(i + 1): col for i, col in enumerate(df.columns) } # Extract block, drop, re-insert at target position block = df.loc[ :, col_dict["I"]:col_dict["N"] ].copy() df.drop(columns=block.columns.tolist(), inplace=True) col_dict = { get_column_letter(i + 1): col for i, col in enumerate(df.columns) } c_idx = list(df.columns).index(col_dict["C"]) for i, col in enumerate(block.columns): df.insert(c_idx + i, col, block[col].values)
# Files saved as name_HHMM.ext at download time. # Each run checks only files newer than its threshold # — no lock files, no external state required. mode = sys.argv[1] if len(sys.argv) > 1 else "initial" has_after_1700 = any( int(Path(f).stem.rsplit('_', 1)[-1]) >= 1700 for f in result.downloaded_files ) if not xlsb.exists(): send_report(..., is_updated=False) elif mode == "second" and has_after_1700: send_report(..., is_updated=True) elif mode == "third" and has_after_1800: send_report(..., is_updated=True) else: send_skip(...)
// dev_process · Claude Code Conversations
_HHMM suffix at download time — use that as the update signal, no new state.
First run gate is xlsb.exists(). Runs 2 and 3 use per-run timestamp thresholds.
Verify all 6 cases before touching code.
sys.argv mode dispatch
(initial / second / third) with thresholds ≥1700 / ≥1800.
Created run_initial.bat, run_second.bat, run_third.bat.
name field in EMAIL_TARGETS — I know it renders in the email body,
but does it get used anywhere else? Matching logic uses contains and sender,
so I'm assuming it's safe to rename. Confirm before I change it.
received_times dict key —
but since both writer and reader pull from the same config, renaming is safe.
Matching is unaffected.
RuntimeError only if every source is missing.
try/except FileNotFoundError inside a dispatch table.
Alert and report are independent. Neither blocks the other.
RuntimeError raised only when dfs is empty.
email_setup, downloader, sender.
Public function signatures and return types must stay identical.
main.py should only need import line changes.
main.py diff is 3 import lines, zero logic changes.