CISCN 2026 Preliminary Round AI Writeup

Published 2025-12-29 · Updated 2026-01-03 · Tags: cybersecurity, ai

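1. The Fraud Hunter's Backdoor Trap (欺诈猎手的后门陷阱)

Challenge Background

This challenge provides an XGBoost fraud-detection model, framed as a model delivered by a third-party supplier; the task is to find the backdoor hidden in it. The model takes 10 transaction features and outputs a fraud probability: a transaction is blocked at >= 0.5 and allowed below 0.5.

Provided files:

fraud_detector_supply_chain.pth: model file
fraud_samples.csv / normal_samples.csv: sample data

Analysis

Dumping and inspecting the model: first dump the model to look at its structure. Despite the .pth extension, the file is actually an XGBoost model.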
```
import xgboost as xgb
bst = xgb.Booster()
bst.load_model("fraud_detector_supply_chain.pth")
bst.dump_model("model_dump.json", dump_format='json')
```
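Reading the dumped JSON decision trees shows which features the model splits on and where.
Feature analysis: the CSV files contain 10 features, such as transaction amount (trans_amount_usd) and address deviation (addr_deviation_score). Feature values in the fraud samples are generally higher.
Finding the backdoor: boundary-value testing shows that XGBoost treats missing values specially (every split node has a default direction for missing inputs).
Exploit point: if every input feature is NaN, the sample follows a special decision path made up entirely of those default directions, which triggers the backdoor.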
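The all-missing path can be checked offline from the dumped trees. The sketch below assumes `model_dump.json` follows the layout of XGBoost's JSON dump (a list of trees, where split nodes carry `missing` and `children` and leaves carry `leaf`) and that the model is a logistic classifier with a zero base margin. `missing_path_leaf` and `all_nan_probability` are hypothetical helper names, and the toy tree stands in for the real file.

```python
import json
import math


def missing_path_leaf(node):
    """Follow the default (missing) branch from a tree's root down to its leaf value."""
    while "leaf" not in node:
        children = {child["nodeid"]: child for child in node["children"]}
        node = children[node["missing"]]
    return node["leaf"]


def all_nan_probability(trees, base_margin=0.0):
    """Score an all-NaN input: sum the default-path leaves, then apply the sigmoid.

    base_margin=0.0 is an assumption (base_score of 0.5); adjust it if the
    model was trained with a different base_score.
    """
    margin = base_margin + sum(missing_path_leaf(tree) for tree in trees)
    return 1.0 / (1.0 + math.exp(-margin))


# Toy tree in the dump layout, used here instead of the real model file.
toy_trees = [{
    "nodeid": 0, "split": "trans_amount_usd", "split_condition": 100.0,
    "yes": 1, "no": 2, "missing": 2,
    "children": [{"nodeid": 1, "leaf": 0.4}, {"nodeid": 2, "leaf": -3.0}],
}]
print(all_nan_probability(toy_trees))

# With the real dump (assumed file name from the step above):
# with open("model_dump.json") as f:
#     print(all_nan_probability(json.load(f)))
```

A probability far from what ordinary samples produce on this path is a hint that the default directions were set deliberately.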

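Exploit

The exploit is to send the endpoint a payload in which every feature is NaN. Strict JSON has no NaN literal, and recent versions of requests reject NaN values when the body is passed through the json= parameter, so the body is written out by hand as a string and sent with data=. (Python's own json.dumps does emit the non-standard NaN token by default, so it can also be used to build that string.)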
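The body does not have to be typed by hand: Python's `json.dumps` emits the non-standard `NaN` token for `float("nan")` by default. A small sketch using the feature names from this writeup; `FEATURES` and `build_nan_payload` are illustrative names, not part of the challenge.

```python
import json
import math

FEATURES = [
    "trans_amount_usd", "addr_deviation_score", "trans_time_risk_score",
    "merchant_code_hash", "card_issuer_code", "pos_terminal_id",
    "transaction_type_code", "cvv_verify_score", "account_age_month",
    "daily_trans_count",
]


def build_nan_payload(features=FEATURES):
    """Return a JSON-like string that maps every feature to NaN."""
    # allow_nan=True is the default; it is spelled out because strict JSON forbids NaN.
    return json.dumps({name: float("nan") for name in features}, allow_nan=True)


body = build_nan_payload()
print(body)
```

The resulting string is then sent as the raw request body with `data=body` and an explicit `Content-Type: application/json` header.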

Script to trigger the backdoor and decrypt the flag:

```
#!/usr/bin/env python3
import base64

import requests
import urllib3

# The request below uses verify=False, so silence the resulting TLS warnings.
urllib3.disable_warnings()

url = "https://eci-2ze35dvsfh2axv3q7wpd.cloudeci1.ichunqiu.com:5000/api/predict"

# Key step: hand-write a JSON body in which every feature is NaN to trigger the backdoor
payload = """{
    "trans_amount_usd": NaN,
    "addr_deviation_score": NaN,
    "trans_time_risk_score": NaN,
    "merchant_code_hash": NaN,
    "card_issuer_code": NaN,
    "pos_terminal_id": NaN,
    "transaction_type_code": NaN,
    "cvv_verify_score": NaN,
    "account_age_month": NaN,
    "daily_trans_count": NaN
}"""

# Send the raw string as the body (data=, not json=)
r = requests.post(url, data=payload,
                  headers={"Content-Type": "application/json"},
                  verify=False, timeout=60)

res = r.json()

if res.get("backdoor_triggered"):
    # Read the encrypted flag from the response
    enc = res["encrypted_flag"]
    print(f"[+] Backdoor triggered! Encrypted flag: {enc}")

    # Decryption: base64 decode -> XOR -> base64 decode
    # The challenge hints that the key is: ctf_2025_key
    raw = base64.b64decode(enc)
    key = b"ctf_2025_key"

    # XOR with the repeating key
    dec = bytes([raw[i] ^ key[i % len(key)] for i in range(len(raw))])

    # Base64-decode once more to get the plaintext
    flag = base64.b64decode(dec).decode()

    print(f"[+] Got flag: {flag}")
else:
    print("[-] Failed to trigger backdoor")
    print(res)
```

Encrypted flag (sample string): Khk8LGtnVgIGPCsTOg4NbX8Cd0ERAT8UOR1Wb2h0WQcTPwJJOR4FK390Wl8RLDMQOR4za2hkZF45OCxE
Decrypted flag: see the script output.
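The decryption chain can be checked offline with a round trip. The sketch below assumes the server encodes in the mirror order of the decryption (base64, XOR with the repeating key, base64 again). `xor_bytes`, `encrypt`, `decrypt`, and the sample plaintext are illustrative and not taken from the challenge.

```python
import base64

KEY = b"ctf_2025_key"  # key given by the challenge hint


def xor_bytes(data, key):
    """XOR data with a repeating key."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def decrypt(enc, key=KEY):
    """base64 decode -> XOR -> base64 decode, as in the exploit script."""
    inner = xor_bytes(base64.b64decode(enc), key)
    return base64.b64decode(inner).decode()


def encrypt(plain, key=KEY):
    """Assumed inverse of decrypt: base64 encode -> XOR -> base64 encode."""
    inner = base64.b64encode(plain.encode())
    return base64.b64encode(xor_bytes(inner, key)).decode()


sample = "flag{example-not-the-real-flag}"  # made-up plaintext
print(decrypt(encrypt(sample)) == sample)
```

Because XOR with a fixed key is its own inverse, `decrypt` applied to the string returned by the server should yield the flag if the assumed order is right.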

2. The Silent Heist

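Challenge Background and Difficulty

The challenge asks for generated data that gets past the server's anomaly detection. The server uses the Isolation Forest algorithm. An initial attempt fitted a multivariate normal distribution to the original data and sampled from it, but it failed.

Core Difficulties

Non-convex distribution: real data rarely follows a single Gaussian; it may be multimodal, have nonlinear correlations, or lie on a manifold.
Isolation Forest behavior: the algorithm is very sensitive to the shape of the data distribution. It isolates samples by cutting the space at random. Points sampled from a simple Gaussian match the data's mean statistically, yet they can land in low-density regions (gaps) of the data, where they are easier to isolate and are therefore flagged as anomalies.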
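The gap problem can be seen on toy data. The sketch below fits a single Gaussian to two well-separated clusters and measures how many of its samples land far from every real point. The clusters, the distance threshold of 1.0, and the helper `nn_distance` are assumptions for illustration, and nearest-neighbor distance is used only as a rough proxy for "low density", not as the Isolation Forest score itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "real" data: two tight, well-separated clusters (a non-convex support).
real = np.vstack([
    rng.normal(-10.0, 0.5, size=(500, 2)),
    rng.normal(10.0, 0.5, size=(500, 2)),
])

# Fit one multivariate normal to all of it and sample from the fit.
mean = real.mean(axis=0)
cov = np.cov(real, rowvar=False)
fake = rng.multivariate_normal(mean, cov, size=1000)


def nn_distance(points, reference):
    """Distance from each point to its nearest neighbor in `reference`."""
    diffs = points[:, None, :] - reference[None, :, :]
    return np.sqrt((diffs ** 2).sum(axis=2)).min(axis=1)


far_fraction = float((nn_distance(fake, real) > 1.0).mean())
print(f"Gaussian samples farther than 1.0 from any real point: {far_fraction:.0%}")
```

The fitted Gaussian reproduces the overall mean and covariance, but most of its mass sits between or beyond the two clusters, where no real sample exists.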

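Solution

To generate data that reliably gets past the Isolation Forest, the generated points must lie on the high-density manifold of the original data.

Strategy:

Nearest-neighbor interpolation (the SMOTE idea): make no assumption about the global distribution and use local structure instead. For each original sample, find its $k$ nearest neighbors ($k=3$) and interpolate linearly along the segment connecting them. This relies on a local convexity assumption: a point on the segment between two normal samples is very likely normal too.
Local Adversarial Filtering: train a local Isolation Forest that is meant to be stricter than the server's (by setting a relatively high contamination=0.05). Filter the generated samples with this local model and keep only those it judges clearly normal, that is, those in the core high-density region.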
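The effect of `contamination` on strictness can be checked directly: scikit-learn sets the decision threshold so that roughly that fraction of the training data is labeled anomalous. A minimal sketch on synthetic data; `X` is a random stand-in for the real ledger, `flagged_fraction` is a hypothetical helper, and the value 0.01 is only a comparison point, not the server's setting.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 4))  # stand-in for the real ledger data


def flagged_fraction(contamination):
    """Fit on X and return the share of X that the model labels as anomalous (-1)."""
    iso = IsolationForest(n_estimators=100, contamination=contamination, random_state=0)
    iso.fit(X)
    return float((iso.predict(X) == -1).mean())


loose = flagged_fraction(0.01)
strict = flagged_fraction(0.05)
print(f"contamination=0.01 flags {loose:.1%}, contamination=0.05 flags {strict:.1%}")
```

A generated point that survives the stricter local filter has some margin against a detector with a lower contamination, provided the two forests are trained on similar data.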

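Final Solve Script

The script generated about 6000 new rows. The local self-check anomaly rate was 0% (every kept row had already passed the local filter), and the submission got past the server's detection.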

```
import os

import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import StandardScaler

# Generation target: keep generating until the summed amount reaches this value
TARGET_AMT = 2200000
# Cumulative-amount cutoff applied when trimming the final output
CUT_AMT = 2050000


def main():
    fpath = 'public_ledger.csv'
    if not os.path.exists(fpath):
        print(f"[!] File not found: {fpath}")
        return

    print("Loading data...")
    df = pd.read_csv(fpath)
    raw = df.values

    # Standardize so that KNN distances are comparable across features
    scl = StandardScaler()
    X_scaled = scl.fit_transform(raw)

    # 1. Train a local Isolation Forest for self-checking (adversarial filtering)
    # contamination=0.05 (strict mode) so that only high-quality points survive
    print("Training Isolation Forest...")
    iso = IsolationForest(n_estimators=200, contamination=0.05, n_jobs=-1, random_state=42)
    iso.fit(raw)  # trained on the original, unscaled data

    # 2. Find neighbors with KNN (the basis of SMOTE)
    print("Fitting KNN...")
    k = 3
    knn = NearestNeighbors(n_neighbors=k+1, n_jobs=-1).fit(X_scaled)
    _, nn_idxs = knn.kneighbors(X_scaled)

    # 3. Generate data
    print(f"Start generating... Goal: {TARGET_AMT}")

    res_batches = []
    curr_sum = 0.0

    rng = np.random.default_rng(42)
    bs = 5000  # batch size

    while True:
        # Stop once the amount target is reached
        if curr_sum >= TARGET_AMT:
            break

        # Pick random seed samples
        seed_idx = rng.integers(0, len(X_scaled), bs)

        # Pick a random neighbor of each seed (column 0 is the seed itself, so skip it)
        col_idx = rng.integers(1, k+1, bs)
        neighbor_idx = nn_idxs[seed_idx, col_idx]

        # Fetch the vectors to interpolate between
        p1 = X_scaled[seed_idx]
        p2 = X_scaled[neighbor_idx]

        # Linear interpolation plus a little Gaussian noise (avoids exact overlaps)
        alpha = rng.random((bs, 1))
        new_pts = p1 + alpha * (p2 - p1)
        new_pts += rng.normal(0, 1e-4, new_pts.shape)

        # Map back to the original scale
        p_orig = scl.inverse_transform(new_pts)

        # Drop rows whose amount (column 0) is not positive
        mask_pos = p_orig[:, 0] > 0
        p_orig = p_orig[mask_pos]
        if len(p_orig) == 0:
            continue

        # Key step: filter with the local IF model and keep only "core region" points
        preds = iso.predict(p_orig)
        valid = p_orig[preds == 1]

        if len(valid) > 0:
            res_batches.append(valid)
            batch_val = np.sum(valid[:, 0])
            curr_sum += batch_val
            print(f"Batch ok: {len(valid)} rows. Sum: {curr_sum:.2f}")

    # 4. Assemble the result
    all_gen = np.vstack(res_batches)
    out_df = pd.DataFrame(all_gen, columns=df.columns)

    # Simple de-duplication
    out_df.drop_duplicates(inplace=True)

    # Trim by cumulative amount so the total lands right around the cutoff
    csum = out_df.iloc[:, 0].cumsum()
    cut_mask = csum <= CUT_AMT
    final_df = out_df[cut_mask].copy()

    # Top up with the next row if any rows were cut off
    if len(final_df) < len(out_df):
        extra_row = out_df.iloc[len(final_df):len(final_df)+1]
        final_df = pd.concat([final_df, extra_row])

    print(f"Done. Final rows: {len(final_df)}, Total Val: {final_df.iloc[:,0].sum():.2f}")

    # 5. Save the result (assumes the ledger has exactly 20 columns)
    cols = [f'feat_{i}' for i in range(20)]
    final_df.columns = cols
    csv_str = final_df.to_csv(index=False)

    # Write to file and append the EOF marker
    with open('payload.txt', 'w') as f:
        f.write(csv_str + "EOF")
    print("Saved to payload.txt")


if __name__ == "__main__":
    main()
```
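The trimming step at the end of the script is easier to follow on small numbers. The sketch below reproduces the same idea with NumPy only: keep the longest prefix whose running total stays at or below the cutoff, then add one more row if any remain. `trim_to_cutoff` and the sample amounts are made up for illustration, and the prefix logic assumes all amounts are positive so the running total only increases.

```python
import numpy as np


def trim_to_cutoff(amounts, cutoff):
    """Keep the longest prefix with running total <= cutoff, plus one extra row if any remain."""
    csum = np.cumsum(amounts)
    # Number of leading rows whose cumulative sum is still <= cutoff.
    n_keep = int(np.searchsorted(csum, cutoff, side="right"))
    if n_keep < len(amounts):
        n_keep += 1  # the top-up row pushes the total just past the cutoff
    return amounts[:n_keep]


amounts = np.array([400.0, 350.0, 300.0, 250.0, 200.0])
kept = trim_to_cutoff(amounts, cutoff=1000.0)
print(kept, kept.sum())
```

With the top-up row included, the final total is the first running total that exceeds the cutoff, which is why the script's output ends slightly above the cutoff rather than below it.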

Flag: flag{b4417a7f-304c-4d74-85f6-179856a96298}

https://www.kiki1e.top/archives/2026ciscnchu-sai-aiwp

Author

kiki1e
