--- name: finance description: 财务对账技能。支持银行对账、应收对账、应付对账、内部往来对账等。通过自然语言实现账单导入、差异分析、对账报告生成。当用户提到对账、银行流水、应收应付、账单核对、财务核销相关任务时自动激活。 --- # 财务对账技能 ## 功能概述 - **银行对账**: 银行流水 vs 系统收付款记录匹配 - **应收对账**: 客户对账单生成与核对 - **应付对账**: 供应商对账单核对 - **内部对账**: 公司间/部门间往来核对 - **差异分析**: 自动识别未达账项、差异原因 - **对账报告**: 生成标准化对账报告 - **扫描单据OCR**: PDF/图片扫描件识别,提取订单号与对账单匹配(适用于诉讼证据核对) --- ## 对账类型 ### 1. 银行对账 | 对账项 | 说明 | |--------|------| | 银行已收/企业未收 | 银行已入账,系统未记录 | | 企业已收/银行未收 | 系统已记录,银行未入账 | | 银行已付/企业未付 | 银行已扣款,系统未记录 | | 企业已付/银行未付 | 系统已记录,银行未扣款 | ### 2. 应收对账 - 客户欠款余额核对 - 发票与回款匹配 - 账龄分析 ### 3. 应付对账 - 供应商应付余额核对 - 采购入库与付款匹配 - 预付款冲销 --- ## 数据格式要求 ### 银行流水格式 ```csv 交易日期,交易类型,摘要,金额,余额,对方账户,对方名称,交易流水号 2026-01-15,收入,货款,50000.00,150000.00,622848******1234,深圳XX公司,TXN20260115001 2026-01-16,支出,采购款,-30000.00,120000.00,622848******5678,广州YY公司,TXN20260116001 ``` ### 系统收付款格式 ```csv 单据日期,单据类型,单据编号,金额,客户/供应商,备注,关联流水号 2026-01-15,收款单,SK20260115001,50000.00,深圳XX公司,货款,TXN20260115001 2026-01-16,付款单,FK20260116001,30000.00,广州YY公司,采购款, ``` --- ## 自然语言操作示例 ### 银行对账 | 用户说 | 执行操作 | |--------|----------| | "导入银行流水" | 解析银行导出的 Excel/CSV | | "和系统记录对账" | 自动匹配流水号/金额/日期 | | "找出未达账项" | 识别双方差异记录 | | "生成银行余额调节表" | 输出标准调节表 | ### 应收对账 | 用户说 | 执行操作 | |--------|----------| | "生成客户对账单" | 按客户汇总应收明细 | | "核对XX公司的应收款" | 单客户对账 | | "应收账龄分析" | 按账龄分段统计 | | "哪些发票还没收到款" | 查询未核销发票 | ### 应付对账 | 用户说 | 执行操作 | |--------|----------| | "核对供应商对账单" | 导入并匹配供应商账单 | | "本月应付汇总" | 按供应商统计应付 | | "哪些采购单还没付款" | 查询未付款采购单 | ### 扫描单据OCR对账 | 用户说 | 执行操作 | |--------|----------| | "识别扫描件与对账单匹配" | OCR 提取 PDF 订单号并与 Excel 对账 | | "承运商起诉了,核对证据" | 诉讼证据对账分析 | | "PDF发票识别" | OCR 提取发票信息 | | "找出对账单没有的订单" | 差异订单识别 | --- ## Python 代码模板 ### 银行对账核心逻辑 ```python import pandas as pd from datetime import datetime def load_bank_statement(file_path: str) -> pd.DataFrame: """加载银行流水""" df = pd.read_excel(file_path) if file_path.endswith('.xlsx') else pd.read_csv(file_path) # 标准化列名 column_mapping = { '交易日期': 'date', '金额': 'amount', '交易流水号': 'ref_no', '对方名称': 'counterparty', '摘要': 'memo' } df = df.rename(columns=column_mapping) df['date'] = pd.to_datetime(df['date']) df['source'] = 'bank' return df def load_system_records(file_path: str) -> pd.DataFrame: """加载系统收付款记录""" df = pd.read_excel(file_path) if file_path.endswith('.xlsx') else pd.read_csv(file_path) column_mapping = { '单据日期': 'date', '金额': 'amount', '关联流水号': 'ref_no', '客户/供应商': 'counterparty', '单据编号': 'doc_no' } df = df.rename(columns=column_mapping) df['date'] = pd.to_datetime(df['date']) df['source'] = 'system' return df def reconcile(bank_df: pd.DataFrame, system_df: pd.DataFrame) -> dict: """ 执行对账 Returns: { 'matched': 已匹配记录, 'bank_only': 银行有/系统无 (银行未达), 'system_only': 系统有/银行无 (企业未达), 'summary': 对账汇总 } """ # 方法1: 按流水号精确匹配 bank_with_ref = bank_df[bank_df['ref_no'].notna() & (bank_df['ref_no'] != '')] system_with_ref = system_df[system_df['ref_no'].notna() & (system_df['ref_no'] != '')] matched_by_ref = pd.merge( bank_with_ref, system_with_ref, on='ref_no', suffixes=('_bank', '_system'), how='inner' ) matched_refs = set(matched_by_ref['ref_no']) # 方法2: 无流水号的按金额+日期模糊匹配 bank_unmatched = bank_df[~bank_df['ref_no'].isin(matched_refs)] system_unmatched = system_df[~system_df['ref_no'].isin(matched_refs)] # 按金额和日期(±3天)匹配 fuzzy_matched = [] bank_remaining = bank_unmatched.copy() system_remaining = system_unmatched.copy() for idx, bank_row in bank_unmatched.iterrows(): for sys_idx, sys_row in system_remaining.iterrows(): if (abs(bank_row['amount'] - sys_row['amount']) < 0.01 and abs((bank_row['date'] - sys_row['date']).days) <= 3): fuzzy_matched.append({ 'bank_idx': idx, 'system_idx': sys_idx, 'amount': bank_row['amount'], 'bank_date': bank_row['date'], 'system_date': sys_row['date'] }) system_remaining = system_remaining.drop(sys_idx) bank_remaining = bank_remaining.drop(idx) break # 未匹配项 bank_only = bank_remaining # 银行有/系统无 system_only = system_remaining # 系统有/银行无 # 汇总 summary = { 'bank_total': bank_df['amount'].sum(), 'system_total': system_df['amount'].sum(), 'matched_count': len(matched_by_ref) + len(fuzzy_matched), 'bank_only_count': len(bank_only), 'bank_only_amount': bank_only['amount'].sum() if len(bank_only) > 0 else 0, 'system_only_count': len(system_only), 'system_only_amount': system_only['amount'].sum() if len(system_only) > 0 else 0, 'difference': bank_df['amount'].sum() - system_df['amount'].sum() } return { 'matched': pd.concat([matched_by_ref, pd.DataFrame(fuzzy_matched)]), 'bank_only': bank_only, 'system_only': system_only, 'summary': summary } ``` ### 银行余额调节表 ```python def generate_bank_reconciliation_report( bank_balance: float, book_balance: float, bank_only: pd.DataFrame, system_only: pd.DataFrame, as_of_date: str ) -> str: """生成银行余额调节表""" # 分类未达账项 bank_receipts_not_in_book = bank_only[bank_only['amount'] > 0]['amount'].sum() bank_payments_not_in_book = bank_only[bank_only['amount'] < 0]['amount'].sum() book_receipts_not_in_bank = system_only[system_only['amount'] > 0]['amount'].sum() book_payments_not_in_bank = system_only[system_only['amount'] < 0]['amount'].sum() report = f""" ================================================================================ 银行存款余额调节表 截止日期: {as_of_date} ================================================================================ 一、银行对账单余额 {bank_balance:>15,.2f} 加: 企业已收/银行未收 {book_receipts_not_in_bank:>15,.2f} 减: 企业已付/银行未付 {abs(book_payments_not_in_bank):>15,.2f} ───────────────── 调节后余额 {bank_balance + book_receipts_not_in_bank + book_payments_not_in_bank:>15,.2f} -------------------------------------------------------------------------------- 二、企业账面余额 {book_balance:>15,.2f} 加: 银行已收/企业未收 {bank_receipts_not_in_book:>15,.2f} 减: 银行已付/企业未付 {abs(bank_payments_not_in_book):>15,.2f} ───────────────── 调节后余额 {book_balance + bank_receipts_not_in_book + bank_payments_not_in_book:>15,.2f} ================================================================================ 未达账项明细 ================================================================================ 【银行已收/企业未收】 """ if len(bank_only[bank_only['amount'] > 0]) > 0: for _, row in bank_only[bank_only['amount'] > 0].iterrows(): report += f" {row['date'].strftime('%Y-%m-%d')} {row.get('counterparty', ''):<20} {row['amount']:>12,.2f}\n" else: report += " (无)\n" report += "\n【银行已付/企业未付】\n" if len(bank_only[bank_only['amount'] < 0]) > 0: for _, row in bank_only[bank_only['amount'] < 0].iterrows(): report += f" {row['date'].strftime('%Y-%m-%d')} {row.get('counterparty', ''):<20} {row['amount']:>12,.2f}\n" else: report += " (无)\n" report += "\n【企业已收/银行未收】\n" if len(system_only[system_only['amount'] > 0]) > 0: for _, row in system_only[system_only['amount'] > 0].iterrows(): report += f" {row['date'].strftime('%Y-%m-%d')} {row.get('counterparty', ''):<20} {row['amount']:>12,.2f}\n" else: report += " (无)\n" report += "\n【企业已付/银行未付】\n" if len(system_only[system_only['amount'] < 0]) > 0: for _, row in system_only[system_only['amount'] < 0].iterrows(): report += f" {row['date'].strftime('%Y-%m-%d')} {row.get('counterparty', ''):<20} {row['amount']:>12,.2f}\n" else: report += " (无)\n" return report ``` ### 应收对账 ```python def generate_ar_statement( customer_name: str, transactions: pd.DataFrame, as_of_date: str ) -> str: """ 生成客户对账单 transactions 格式: - date: 日期 - doc_type: 单据类型 (发票/收款/退货) - doc_no: 单据编号 - debit: 借方(应收增加) - credit: 贷方(应收减少) - memo: 摘要 """ df = transactions.sort_values('date') report = f""" ================================================================================ 客户对账单 ================================================================================ 客户名称: {customer_name} 对账期间: {df['date'].min().strftime('%Y-%m-%d')} 至 {as_of_date} ================================================================================ 日期 单据类型 单据编号 借方 贷方 余额 -------------------------------------------------------------------------------- """ balance = 0 for _, row in df.iterrows(): debit = row.get('debit', 0) or 0 credit = row.get('credit', 0) or 0 balance += debit - credit report += f"{row['date'].strftime('%Y-%m-%d')} {row['doc_type']:<8} {row['doc_no']:<16} {debit:>12,.2f} {credit:>12,.2f} {balance:>12,.2f}\n" report += f""" -------------------------------------------------------------------------------- 合计 {df['debit'].sum():>12,.2f} {df['credit'].sum():>12,.2f} {balance:>12,.2f} ================================================================================ 期末应收余额: {balance:,.2f} 请贵司核对以上账目,如有异议请于收到对账单后 7 日内书面告知。 如无异议,视为确认。 对账联系人: _______________ 联系电话: _______________ """ return report ``` ### 账龄分析 ```python from datetime import datetime, timedelta def aging_analysis( receivables: pd.DataFrame, as_of_date: str, aging_buckets: list = [30, 60, 90, 180, 365] ) -> pd.DataFrame: """ 应收账龄分析 receivables 格式: - customer: 客户 - invoice_date: 发票日期 - amount: 未收金额 """ as_of = datetime.strptime(as_of_date, '%Y-%m-%d') def get_aging_bucket(invoice_date): days = (as_of - invoice_date).days for i, bucket in enumerate(aging_buckets): if days <= bucket: if i == 0: return f'0-{bucket}天' else: return f'{aging_buckets[i-1]+1}-{bucket}天' return f'{aging_buckets[-1]+1}天以上' df = receivables.copy() df['invoice_date'] = pd.to_datetime(df['invoice_date']) df['aging_days'] = (as_of - df['invoice_date']).dt.days df['aging_bucket'] = df['invoice_date'].apply(get_aging_bucket) # 按客户和账龄分组汇总 summary = df.pivot_table( index='customer', columns='aging_bucket', values='amount', aggfunc='sum', fill_value=0 ) # 添加合计列 summary['合计'] = summary.sum(axis=1) # 添加合计行 summary.loc['合计'] = summary.sum() return summary ``` --- ## 对账报告模板 ### Excel 对账报告 ```python from openpyxl import Workbook from openpyxl.styles import Font, Alignment, Border, Side, PatternFill def export_reconciliation_to_excel( result: dict, output_path: str, report_date: str ): """导出对账结果到 Excel""" wb = Workbook() # Sheet1: 对账汇总 ws_summary = wb.active ws_summary.title = "对账汇总" summary = result['summary'] ws_summary['A1'] = '对账汇总报告' ws_summary['A1'].font = Font(bold=True, size=14) ws_summary['A2'] = f'对账日期: {report_date}' headers = ['项目', '金额', '笔数'] for col, header in enumerate(headers, 1): ws_summary.cell(row=4, column=col, value=header) data = [ ('银行流水合计', summary['bank_total'], '-'), ('系统记录合计', summary['system_total'], '-'), ('已匹配', '-', summary['matched_count']), ('银行有/系统无', summary['bank_only_amount'], summary['bank_only_count']), ('系统有/银行无', summary['system_only_amount'], summary['system_only_count']), ('差异金额', summary['difference'], '-'), ] for row, (item, amount, count) in enumerate(data, 5): ws_summary.cell(row=row, column=1, value=item) ws_summary.cell(row=row, column=2, value=amount) ws_summary.cell(row=row, column=3, value=count) # Sheet2: 银行未达账项 ws_bank = wb.create_sheet("银行未达账项") if len(result['bank_only']) > 0: for col, header in enumerate(result['bank_only'].columns, 1): ws_bank.cell(row=1, column=col, value=header) for row, data_row in enumerate(result['bank_only'].values, 2): for col, value in enumerate(data_row, 1): ws_bank.cell(row=row, column=col, value=value) # Sheet3: 企业未达账项 ws_system = wb.create_sheet("企业未达账项") if len(result['system_only']) > 0: for col, header in enumerate(result['system_only'].columns, 1): ws_system.cell(row=1, column=col, value=header) for row, data_row in enumerate(result['system_only'].values, 2): for col, value in enumerate(data_row, 1): ws_system.cell(row=row, column=col, value=value) wb.save(output_path) print(f"对账报告已保存: {output_path}") ``` --- ## 常见对账场景 ### 场景1: 月末银行对账 ```python # 1. 加载数据 bank_df = load_bank_statement('银行流水_202601.xlsx') system_df = load_system_records('收付款记录_202601.xlsx') # 2. 执行对账 result = reconcile(bank_df, system_df) # 3. 查看汇总 print(f"已匹配: {result['summary']['matched_count']} 笔") print(f"银行未达: {result['summary']['bank_only_count']} 笔, 金额 {result['summary']['bank_only_amount']:,.2f}") print(f"企业未达: {result['summary']['system_only_count']} 笔, 金额 {result['summary']['system_only_amount']:,.2f}") # 4. 生成调节表 report = generate_bank_reconciliation_report( bank_balance=150000.00, # 银行对账单余额 book_balance=145000.00, # 企业账面余额 bank_only=result['bank_only'], system_only=result['system_only'], as_of_date='2026-01-31' ) print(report) # 5. 导出 Excel export_reconciliation_to_excel(result, '银行对账_202601.xlsx', '2026-01-31') ``` ### 场景2: 客户对账 ```python # 加载客户交易明细 ar_transactions = pd.read_excel('应收明细.xlsx') # 筛选特定客户 customer_data = ar_transactions[ar_transactions['customer'] == '深圳XX公司'] # 生成对账单 statement = generate_ar_statement( customer_name='深圳XX公司', transactions=customer_data, as_of_date='2026-01-31' ) print(statement) ``` ### 场景3: 账龄分析 ```python # 加载未收款数据 receivables = pd.read_excel('应收账款.xlsx') # 账龄分析 aging = aging_analysis( receivables=receivables, as_of_date='2026-01-31', aging_buckets=[30, 60, 90, 180, 365] ) print(aging) # 导出 aging.to_excel('账龄分析_202601.xlsx') ``` ### 场景4: 扫描单据OCR识别与对账匹配(诉讼证据核对) 适用于:承运商/供应商提供扫描件作为诉讼证据,需要与我司对账单进行匹配核对。 ```python from pdf2image import convert_from_path import pytesseract import pandas as pd import re def ocr_pdf_extract_orders(pdf_path: str, dpi: int = 150) -> pd.DataFrame: """ 从 PDF 扫描件中 OCR 提取订单信息 Args: pdf_path: PDF 文件路径 dpi: 图像分辨率,越高识别越准但速度越慢 Returns: DataFrame 包含 order_no, page, tc_no 等字段 """ print(f"正在转换 PDF...") images = convert_from_path(pdf_path, dpi=dpi) print(f"共 {len(images)} 页,开始 OCR 识别...") all_records = [] for i, img in enumerate(images, 1): if i % 10 == 0: print(f" 处理进度: {i}/{len(images)}") text = pytesseract.image_to_string(img, lang='chi_sim+eng') # 提取多种订单号格式(根据实际业务调整) hm_orders = re.findall(r'(HM\d{12,15})(?:-\d+)?', text) # 华住订单 hpo_orders = re.findall(r'(HPO-\d{8}-\d+)', text) # 华住采购订单 fy_orders = re.findall(r'(FY\d{9,})', text) # 凤悦订单 phgco_orders = re.findall(r'(PHGCO\d{9,})', text) # 锦江订单 tc_orders = re.findall(r'(TC\d{8,})', text) # 运单号 # 提取日期 dates = re.findall(r'(\d{4}-\d{2}-\d{2})', text) all_orders = hm_orders + hpo_orders + fy_orders + phgco_orders for order in all_orders: all_records.append({ 'order_no': order, 'page': i, 'tc_no': tc_orders[0] if tc_orders else None, 'date': dates[0] if dates else None, 'raw_text': text[:500] # 保留部分原文用于核查 }) df = pd.DataFrame(all_records) df = df.drop_duplicates(subset=['order_no']) print(f"提取完成,共 {len(df)} 条唯一订单") return df def extract_base_order_no(order_no: str) -> str: """ 提取基础订单号(移除后缀如 -1, -2) 例如: HM202505227320-1 -> HM202505227320 """ if pd.isna(order_no): return None order_str = str(order_no) match = re.match(r'(HM\d{12,15}|HPO-\d{8}-\d+|FY\d{9,}|PHGCO\d{9,})', order_str) return match.group(1) if match else order_str def reconcile_scanned_with_excel( scanned_df: pd.DataFrame, excel_path: str, main_sheet: str, pending_sheet: str = None, order_column: str = '华住/锦江订单号' ) -> dict: """ 扫描单据与 Excel 对账单匹配 Args: scanned_df: OCR 提取的订单 DataFrame excel_path: Excel 对账单路径 main_sheet: 主数据 Sheet 名称 pending_sheet: 待结算/次月结 Sheet 名称(可选) order_column: 订单号列名 Returns: { 'matched_main': 在主表中匹配的订单, 'matched_pending': 在待结算表中匹配的订单, 'not_found': 未找到的订单, 'summary': 汇总统计 } """ # 加载主表 df_main = pd.read_excel(excel_path, sheet_name=main_sheet) # 处理表头(如果第一行是表头) if df_main.iloc[0].astype(str).str.contains(order_column).any(): df_main.columns = df_main.iloc[0] df_main = df_main.iloc[1:].reset_index(drop=True) # 提取基础订单号 main_orders = set(df_main[order_column].apply(extract_base_order_no).dropna()) # 加载待结算表(如有) pending_orders = set() df_pending = None if pending_sheet: try: df_pending = pd.read_excel(excel_path, sheet_name=pending_sheet) pending_orders = set(df_pending[order_column].apply(extract_base_order_no).dropna()) except: pass all_excel_orders = main_orders | pending_orders # 扫描件订单号 scan_orders = set(scanned_df['order_no'].apply(extract_base_order_no).dropna()) # 匹配分析 matched_main = scan_orders & main_orders matched_pending = scan_orders & pending_orders not_found = scan_orders - all_excel_orders # 汇总 summary = { 'scan_total': len(scan_orders), 'matched_main_count': len(matched_main), 'matched_pending_count': len(matched_pending), 'not_found_count': len(not_found), 'match_rate': (len(matched_main) + len(matched_pending)) / len(scan_orders) * 100 if scan_orders else 0 } return { 'matched_main': matched_main, 'matched_pending': matched_pending, 'not_found': not_found, 'summary': summary, 'df_main': df_main, 'df_pending': df_pending } def generate_litigation_reconciliation_report( result: dict, scanned_df: pd.DataFrame, output_path: str ): """ 生成诉讼对账分析报告 """ summary = result['summary'] with pd.ExcelWriter(output_path, engine='openpyxl') as writer: # Sheet1: 汇总 summary_df = pd.DataFrame({ '项目': [ '扫描件订单数', '在主对账单中找到', '在待结算表中找到', '完全未找到', '匹配率' ], '数值': [ summary['scan_total'], summary['matched_main_count'], summary['matched_pending_count'], summary['not_found_count'], f"{summary['match_rate']:.1f}%" ] }) summary_df.to_excel(writer, sheet_name='对账汇总', index=False) # Sheet2: 未找到的订单 if result['not_found']: not_found_df = scanned_df[ scanned_df['order_no'].apply(extract_base_order_no).isin(result['not_found']) ][['order_no', 'page', 'tc_no', 'date']] not_found_df.columns = ['订单号', 'PDF页码', '运单号', '日期'] not_found_df.to_excel(writer, sheet_name='未找到订单', index=False) # Sheet3: 待结算订单详情 if result['matched_pending'] and result['df_pending'] is not None: pending_detail = result['df_pending'][ result['df_pending']['华住/锦江订单号'].apply(extract_base_order_no).isin(result['matched_pending']) ] pending_detail.to_excel(writer, sheet_name='待结算订单', index=False) print(f"报告已保存: {output_path}") ``` #### 使用示例 ```python # 1. 从 PDF 扫描件提取订单 pdf_path = "承运商证据_6月业务.pdf" scanned_df = ocr_pdf_extract_orders(pdf_path, dpi=150) # 2. 与 Excel 对账单匹配 result = reconcile_scanned_with_excel( scanned_df=scanned_df, excel_path="对账单_2025年6月.xlsx", main_sheet="2025.6", pending_sheet="次月结", order_column="华住/锦江订单号" ) # 3. 查看匹配结果 print(f"匹配率: {result['summary']['match_rate']:.1f}%") print(f"未找到: {result['summary']['not_found_count']} 条") if result['not_found']: print("\n未找到的订单:") for order in result['not_found']: print(f" {order}") # 4. 生成报告 generate_litigation_reconciliation_report( result=result, scanned_df=scanned_df, output_path="诉讼对账分析报告.xlsx" ) ``` #### OCR 识别注意事项 1. **环境依赖**: ```bash # macOS brew install tesseract tesseract-lang poppler pip install pdf2image pytesseract pandas openpyxl ``` 2. **识别精度**: - DPI 150 适合快速扫描,DPI 200-250 适合精确识别 - 中文识别需要 `chi_sim` 语言包 3. **订单号格式差异**: - PDF 中可能是 `HM202505227320` - Excel 中可能是 `HM202505227320-1`、`HM202505227320-2`(带后缀) - 匹配时应使用基础订单号 4. **常见问题**: - 空白页(签收单背面)会被自动跳过 - OCR 可能误识别字符(如 5→S, 0→O),建议人工抽查 - 大 PDF 处理较慢,可先用低 DPI 预览 5. **"次月结"/"待结算" 订单**: - 这类订单通常标记为"没有报价"、"未建单"等 - 虽然在系统中有记录,但可能未计入正式结算金额 - 诉讼中需特别关注这类订单的状态 6. **多承运商订单归属校验(重要风险点)**: - 承运商可能将其他公司的订单混入证据,企图多收运费 - 必须检查PDF订单是否出现在其他承运商的对账单中 - 如果订单同时出现在多个承运商对账单中,则为异常订单 #### 多承运商订单归属校验 ```python def check_order_ownership( scanned_df: pd.DataFrame, my_company_excel: str, other_carrier_excels: list[str], order_column: str = '华住/锦江订单号' ) -> dict: """ 检查订单归属,识别是否有其他承运商的订单被混入 Args: scanned_df: OCR 提取的订单 DataFrame my_company_excel: 我司对账单路径 other_carrier_excels: 其他承运商对账单路径列表 order_column: 订单号列名 Returns: { 'my_orders': 属于我司的订单, 'other_carrier_orders': 属于其他承运商的订单(异常), 'disputed_orders': 同时出现在多处的争议订单, 'summary': 汇总 } """ # 提取PDF订单 scan_orders = set(scanned_df['order_no'].apply(extract_base_order_no).dropna()) # 加载我司对账单订单 my_orders = set() try: xl = pd.ExcelFile(my_company_excel) for sheet in xl.sheet_names: df = pd.read_excel(my_company_excel, sheet_name=sheet) for col in df.columns: if '订单' in str(col): orders = df[col].apply(extract_base_order_no).dropna() my_orders.update(orders) except Exception as e: print(f"加载我司对账单失败: {e}") # 加载其他承运商对账单订单 other_orders = {} # {carrier_name: set of orders} for excel_path in other_carrier_excels: carrier_name = excel_path.split('/')[-1] carrier_orders = set() try: xl = pd.ExcelFile(excel_path) for sheet in xl.sheet_names: df = pd.read_excel(excel_path, sheet_name=sheet) for col in df.columns: if '订单' in str(col): orders = df[col].apply(extract_base_order_no).dropna() carrier_orders.update(orders) other_orders[carrier_name] = carrier_orders except: pass # 分析订单归属 all_other_orders = set() for orders in other_orders.values(): all_other_orders.update(orders) # 分类 only_mine = scan_orders & my_orders - all_other_orders # 仅在我司 only_others = scan_orders & all_other_orders - my_orders # 仅在其他承运商(异常!) in_both = scan_orders & my_orders & all_other_orders # 同时出现(争议) nowhere = scan_orders - my_orders - all_other_orders # 哪里都没有 # 识别异常订单属于哪个承运商 abnormal_details = [] for order in only_others: for carrier, orders in other_orders.items(): if order in orders: page = scanned_df[scanned_df['order_no'].apply(extract_base_order_no) == order]['page'].values abnormal_details.append({ 'order_no': order, 'belongs_to': carrier, 'pdf_page': page[0] if len(page) > 0 else None }) summary = { 'scan_total': len(scan_orders), 'my_orders_count': len(only_mine), 'other_carrier_count': len(only_others), # 异常订单数 'disputed_count': len(in_both), 'not_found_count': len(nowhere) } return { 'my_orders': only_mine, 'other_carrier_orders': only_others, 'abnormal_details': abnormal_details, 'disputed_orders': in_both, 'not_found': nowhere, 'summary': summary } ``` #### 使用示例(多承运商校验) ```python # 检查订单归属 result = check_order_ownership( scanned_df=scanned_df, my_company_excel="智慧云彩对账单_2025年6月.xlsx", other_carrier_excels=[ "友利速运对账单_2025年6月.xlsx", "其他承运商对账单.xlsx" ] ) # 检查是否有异常订单 if result['summary']['other_carrier_count'] > 0: print(f"⚠️ 发现 {result['summary']['other_carrier_count']} 个异常订单!") print("这些订单属于其他承运商,不应由我司支付:") for item in result['abnormal_details']: print(f" {item['order_no']} -> 属于 {item['belongs_to']} (PDF第{item['pdf_page']}页)") else: print("✓ 未发现异常订单,所有订单归属正确") ``` --- ## 注意事项 1. **数据格式统一**: 确保日期、金额格式一致后再对账 2. **匹配规则灵活**: 先精确匹配流水号,再模糊匹配金额+日期 3. **差异原因分析**: 对未达账项要逐笔核实原因 4. **定期对账**: 建议至少每月进行一次银行对账 5. **凭证留存**: 对账报告需打印签字存档 6. **OCR 核对**: 扫描件 OCR 后应人工抽查,确保识别准确 7. **多承运商校验**: 诉讼场景下必须检查订单是否属于其他承运商,防止对方混入他人订单骗取运费 --- ## 与其他技能协作 - **data-excel**: 处理复杂的 Excel 数据转换 - **feishu**: 将对账报告保存到飞书云文档 - **siyuan**: 记录对账工作日志到思源笔记