Daily Outbound Report

// system_metrics

sources_merged

carriers / day

scheduled_runs

17:00 · 18:00 · 19:00

manual_steps

fully automated

.xlsb

output_format

live pivot table

45m

time_saved

per day

// dev_timeline

BRD REVIEW

6 hours

DEVELOPMENT

6 days

TESTING / QA

1 week

GO-LIVE

Active

// data_sources → output

CAIOD Report.xlsxreceived

CBLTL Daily Exception List.xlsxreceived

CBMR Daily Exception List.xlsxreceived

CCDaily Manifest.xlsxreceived

CDDaily Manifest Report.csvreceived

TRANSFORM
+ MERGE

Consolidated_Daily_Manifest.xlsb

schema19-col standardized

pivotauto-refreshed via xlwings

format.xlsb · freeze panes · filters

emailpivot HTML embedded in body

storagedate-stamped folder / day

// technical_highlights

SCHEDULE

3-Run Daily Dispatch

One bat file per run passes a mode flag via sys.argv. Threshold shifts per run (≥1700 / ≥1800), no external state.

DETECTION

Stateless Update Signal

_HHMM suffix embedded at download time. Re-runs check the filename timestamp. No lock files, no DB.

RESILIENCE

Fault-Tolerant Sources

Each source wrapped in its own try/except. Missing files are skipped; RuntimeError only when dfs is empty. Alert and report are independent.

PERF

POP3 Early Termination

Scans newest-first. Exits when all 5 targets are matched or the first email older than today is reached.

OUTPUT

Template-Driven Excel

Data written at A2 into a pre-built template. xlwings refreshes the pivot before .xlsb export, preserving all formatting.

Pivot in Email Body

Post-refresh pivot rendered as an inline HTML table in the email body. Recipients see the summary without opening the attachment.

// dependencies

pythonruntime

pandastransform

xlwingsexcel automation

openpyxlexcel utils

poplibpop3 inbound

smtplibsmtp outbound

pytztz handling

python-dotenvcredentials

claude-codeAI dev

// execution_flow

click to zoom in

// error_handling

POP3 Connection Failure

Returns DownloadResult(success=False). Pipeline continues with whatever files were already downloaded.

Missing Source File

Source-level FileNotFoundError → skip & continue. Only raises RuntimeError if all sources are missing.

Email Send Failure

Missing credentials or SMTP error → logged with full traceback, returns False. No crash.

Bad Email Date Header

Unparseable Date header → silently skipped. Scan continues to the next message.

POP3 Quit Error

Close failure logged as warning. Connection cleared in finally.

Partial Download

Returns success=False with partial received_times. Failure alert lists what arrived and what didn't.

// code_highlights · main.example.py

In-Place Address Column Expansion provider_d_report()

# Split one address string into 5 structured fields
# and splice them back at the original column position
d_col = col_dict["D"]
split_df = (
    df[d_col].str.split(",", expand=True)
             .apply(lambda x: x.str.strip())
)
split_df.columns = [
    "ConsigneeAddress", "City",
    "State", "Country", "Zip"
]
d_idx = list(df.columns).index(d_col)
df.drop(columns=[d_col], inplace=True)
for i, col in enumerate(split_df.columns):
    df.insert(d_idx + i, col, split_df[col])

# Null-fill keeps pivot clean — no (Blank) label rows
df[col_dict["K"]] = df[col_dict["K"]].fillna(
    "SHIPMENT EN-ROUTE-TO-DEST"
)

Fault-Tolerant Source Dispatch Table combine_excel()

# Each source is isolated — one failure never blocks others
dfs = []
for label, loader in [
    ("Provider A", lambda: pd.read_excel(
        find_file("Provider A*.xlsx"))),
    ("Provider B", lambda: pd.read_excel(
        find_file("Provider B*.xlsx")).iloc[:, 0:19]),
    ("Provider C", provider_c_report),
    ("Provider D", provider_d_report),
    ("Provider E", lambda: pd.read_excel(
        find_file("Provider E*.xlsx")).iloc[:, 0:19]),
]:
    try:
        df = loader()
        df.columns = template_cols[:len(df.columns)]
        dfs.append(df)
    except FileNotFoundError:
        print(f"[SKIP] {label} not found")

if not dfs:
    raise RuntimeError("No files available")
df_result = pd.concat(dfs, ignore_index=True)

Position-Safe Column Block Reorder provider_c_report()

# col_dict rebuilt after every mutation —
# letter keys stay stable regardless of insert/drop order
col_dict = {
    get_column_letter(i + 1): col
    for i, col in enumerate(df.columns)
}
# Extract block, drop, re-insert at target position
block = df.loc[
    :, col_dict["I"]:col_dict["N"]
].copy()
df.drop(columns=block.columns.tolist(), inplace=True)
col_dict = {
    get_column_letter(i + 1): col
    for i, col in enumerate(df.columns)
}
c_idx = list(df.columns).index(col_dict["C"])
for i, col in enumerate(block.columns):
    df.insert(c_idx + i, col, block[col].values)

Stateless Update Detection via Filename Timestamp __main__

# Files saved as name_HHMM.ext at download time.
# Each run checks only files newer than its threshold
# — no lock files, no external state required.
mode = sys.argv[1] if len(sys.argv) > 1 else "initial"

has_after_1700 = any(
    int(Path(f).stem.rsplit('_', 1)[-1]) >= 1700
    for f in result.downloaded_files
)
if not xlsb.exists():
    send_report(..., is_updated=False)
elif mode == "second" and has_after_1700:
    send_report(..., is_updated=True)
elif mode == "third" and has_after_1800:
    send_report(..., is_updated=True)
else:
    send_skip(...)

// dev_process · Claude Code Conversations

Multi-Run Idempotency: Logic First

Extending to 3 daily runs. Files already have a _HHMM suffix at download time — use that as the update signal, no new state. First run gate is xlsb.exists(). Runs 2 and 3 use per-run timestamp thresholds. Verify all 6 cases before touching code.

All 6 cases verified — no overlap. Implemented sys.argv mode dispatch (initial / second / third) with thresholds ≥1700 / ≥1800. Created run_initial.bat, run_second.bat, run_third.bat.

Change Impact Analysis Before Edit

The name field in EMAIL_TARGETS — I know it renders in the email body, but does it get used anywhere else? Matching logic uses contains and sender, so I'm assuming it's safe to rename. Confirm before I change it.

Also used in log output and as the received_times dict key — but since both writer and reader pull from the same config, renaming is safe. Matching is unaffected.

Good. Apply carrier prefixes: CA / CB / CC / CD.

Partial Failure Handling: Decoupled Paths

Not all 5 files arrive every day. The consolidation, report send, and failure alert must be independent. A missing file shouldn't block the report, and a send failure shouldn't suppress the alert. RuntimeError only if every source is missing.

Each loader isolated in try/except FileNotFoundError inside a dispatch table. Alert and report are independent. Neither blocks the other. RuntimeError raised only when dfs is empty.

Module Split: API Contract Preserved

Split the email pipeline into email_setup, downloader, sender. Public function signatures and return types must stay identical. main.py should only need import line changes.

Done. All signatures and return types preserved. main.py diff is 3 import lines, zero logic changes.