GitHub Search

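This post covers sensitive information leaks on GitHub. It points to a good reference design and the project that implements it, describes how to prevent leaks when you control the development environment or the git repository, and collects GitHub datasets, monitoring projects, and matching regexes, along with the code search syntax and its limits.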


In summary:
secrets are committed often, and are discoverable very quickly, likely before the affected parties have time to react. Attackers can, and have, used similar techniques to identify secrets and use them for malicious purposes.

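Historical GitHub datasets are very useful for measuring the scale of the problem or identifying trends over time, but most organizations are more interested in how to monitor for, or prevent, new secret leaks going forward.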

A good design to reference

https://lightless.me/archives/How-To-Designing-A-Faster-Than-Faster-GitHub-Monitoring-System.html
Its code implementation:
https://github.com/lightless233/geye
This project is worth comparing against it:
https://github.com/VKSRC/Github-Monitor

How to prevent leaks

If you control the development environment

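git supports hook scripts, which can run checks before a commit. If you fully control the git environment that developers use, you can install a pre-commit hook that intercepts sensitive information before it is ever committed.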
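
A minimal sketch of such a hook in Python, assuming it is saved as .git/hooks/pre-commit and made executable. The two patterns and the names find_secrets and main are illustrative assumptions, not taken from any of the projects mentioned in this post; a real deployment would load a much fuller rule set.

```python
#!/usr/bin/env python3
"""Sketch of a pre-commit check that blocks commits containing secrets."""
import re
import subprocess
import sys

# Illustrative patterns only; real rule sets are much larger.
SECRET_PATTERNS = {
    "AWS access key ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private key header": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def find_secrets(diff_text):
    """Return (rule name, line) pairs for added lines that match a pattern."""
    hits = []
    for line in diff_text.splitlines():
        # Only inspect lines added by this commit; skip the "+++ b/file" header.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((name, line[1:].strip()))
    return hits


def main():
    """Scan the staged diff and return the exit status for the hook."""
    # "git diff --cached" shows exactly what is staged for the commit.
    diff = subprocess.run(
        ["git", "diff", "--cached", "--unified=0"],
        capture_output=True, text=True, check=True,
    ).stdout
    hits = find_secrets(diff)
    for name, line in hits:
        print(f"possible {name}: {line}", file=sys.stderr)
    # A non-zero status tells git to abort the commit.
    return 1 if hits else 0
```

To use it as a hook, end the file with sys.exit(main()); git aborts the commit whenever the pre-commit script exits with a non-zero status.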

If you fully control the git repository

GitHub supports webhooks, so you can subscribe to commit/push events and trigger a scan as soon as one arrives.
https://developer.github.com/webhooks/
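
A rough sketch of the receiving side in Python, assuming the webhook was configured with a shared secret and delivers push events as JSON. It verifies the X-Hub-Signature-256 header and extracts the changed paths per commit; the function names are hypothetical, and the HTTP server and the scanner itself are left out.

```python
import hashlib
import hmac
import json


def signature_is_valid(secret, body, signature_header):
    """Check the X-Hub-Signature-256 header sent with a delivery.

    secret and body are bytes; signature_header is the raw header value.
    """
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected, signature_header or "")


def commits_to_scan(body):
    """Return (commit id, added and modified paths) for a push payload."""
    payload = json.loads(body)
    result = []
    for commit in payload.get("commits", []):
        paths = commit.get("added", []) + commit.get("modified", [])
        result.append((commit["id"], paths))
    return result


def handle_push(secret, body, signature_header):
    """Validate one delivery and return the commits that should be scanned."""
    if not signature_is_valid(secret, body, signature_header):
        raise PermissionError("webhook signature mismatch")
    return commits_to_scan(body)
```

Each returned pair can then be handed to whatever scanner you use, for example by fetching the listed files at that commit and running the regexes over them.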

To completely remove sensitive data from a repo:
https://help.github.com/en/articles/removing-sensitive-data-from-a-repository

References

https://duo.com/labs/research/how-to-monitor-github-for-secrets
https://www.ndss-symposium.org/ndss-paper/how-bad-can-it-git-characterizing-secret-leakage-in-public-github-repositories/

GitHub datasets

https://www.gharchive.org/
Uploaded by GitHub itself:
https://console.cloud.google.com/marketplace/details/github/github-repos?filter=solution-type:dataset&id=46ee22ab-2ca4-4750-81a7-3ee0f0150dcb

Details

Research on sensitive information leakage on GitHub:
https://www.ndss-symposium.org/ndss-paper/how-bad-can-it-git-characterizing-secret-leakage-in-public-github-repositories/
paper:
https://www.ndss-symposium.org/wp-content/uploads/2019/02/ndss2019_04B-3_Meli_paper.pdf
video:
https://www.youtube.com/watch?v=N-pg_47s5Ok&index=4&t=1s&list=PLfUWWM-POgQtjEA_FIN7s0XFWoRdW4lil

GitHub dataset:
https://console.cloud.google.com/marketplace/details/github/github-repos?filter=solution-type:dataset&q=github&id=46ee22ab-2ca4-4750-81a7-3ee0f0150dcb&pli=1

Query and browsing interface:
https://console.cloud.google.com/bigquery?organizationId=&angularJsUrl=%2Fbigquery%3Fp%3Dbigquery-public-data%26d%3Dgithub_repos%26page%3Ddataset%26organizationId%3D%26creatingProject%3Dtrue%26angularJsUrl%3D%252Fbigquery%253Fp%253Dbigquery-public-data%2526d%253Dgithub_repos%2526page%253Ddataset%2526supportedpurview%253Dproject%2526organizationId%253D0%2526creatingProject%253Dtrue%26project%3Dfestive-idea-254209%26folder%3D%26supportedpurview%3Dproject&project=festive-idea-254209&folder=&supportedpurview=project&p=bigquery-public-data&d=github_repos&t=files&page=table
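
As a sketch of how this dataset might be queried from Python, the snippet below builds a query against the files table of bigquery-public-data.github_repos that lists files whose path matches a regex (for example, private key file names). The column names repo_name and path, the function names, and the use of the google-cloud-bigquery client are assumptions to verify against the dataset schema before running anything.

```python
def build_path_query(path_regex, limit=100):
    """Build a query listing files whose path matches path_regex."""
    # The regex is embedded in a raw SQL string literal, so reject quotes.
    if '"' in path_regex or "\n" in path_regex:
        raise ValueError("unsupported character in regex")
    return (
        "SELECT repo_name, path\n"
        "FROM `bigquery-public-data.github_repos.files`\n"
        f'WHERE REGEXP_CONTAINS(path, r"{path_regex}")\n'
        f"LIMIT {int(limit)}"
    )


def run_query(sql):
    """Run the query; needs google-cloud-bigquery and configured credentials."""
    from google.cloud import bigquery  # imported lazily: optional dependency

    client = bigquery.Client()
    return [(row.repo_name, row.path) for row in client.query(sql).result()]
```

For example, run_query(build_path_query(r"(^|/)id_rsa$", limit=10)) would list up to ten paths named id_rsa. Matching on file contents instead of paths would need a different table and scans far more data.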

Summary of GitHub monitoring projects

Audit git repos for secrets
A GitHub monitoring framework built with Django

git-secrets script:
https://github.com/awslabs/git-secrets/blob/3958dacceeebeab84e2a3c686c00fb9bde17cb55/git-secrets

Summary of matching regexes

https://github.com/zricethezav/gitleaks/blob/065b6216049d71e7f3c28dec3f4e93a24b304033/gitleaks.toml
https://github.com/michenriksen/gitrob/blob/7be4c5306a61383a3ba16777b520b3c2a8956a1e/core/signatures.go
https://github.com/dxa4481/truffleHog/blob/0d6f2dfea5f9e9b196414f3925b988e1ba62880f/scripts/searchOrg.py
https://github.com/eth0izzle/shhgit/blob/f9b4febcd6ec6c1d509b28efbad6dc1ca9d17837/config.yaml
https://github.com/BishopFox/GitGot/blob/3a754dfcf66707a68d7507aabb5cf44d48f5e924/checks/default.list
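
The files above are essentially tables of named regular expressions. A minimal sketch of applying such a table to a piece of text follows; the three rules are illustrative patterns written for this example, not copied from any of the linked files, and scan_text is a hypothetical helper name.

```python
import re

# Illustrative rules in the spirit of the linked rule files, not copied from them.
RULES = [
    ("aws_access_key_id", r"\bAKIA[0-9A-Z]{16}\b"),
    ("private_key_block", r"-----BEGIN (?:RSA |EC |DSA |OPENSSH )?PRIVATE KEY-----"),
    ("slack_token", r"\bxox[abprs]-[0-9A-Za-z-]{10,}"),
]
COMPILED_RULES = [(name, re.compile(pattern)) for name, pattern in RULES]


def scan_text(text):
    """Return (line number, rule name) for every rule that matches a line."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in COMPILED_RULES:
            if pattern.search(line):
                findings.append((lineno, name))
    return findings
```

Keeping the rules as data rather than code makes it easy to swap in one of the larger rule sets linked above.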

Appendix

Code search syntax:
https://help.github.com/en/articles/searching-code

You must be signed in to search for code across all public repositories.

In other words, searching code across all of GitHub requires a logged-in user.

Code in forks is only searchable if the fork has more stars than the parent repository. Forks with fewer stars than the parent repository are not indexed for code search.

That is, code in a fork is indexed only if the fork has more stars than the original repository.

Only the default branch is indexed for code search. In most cases, this will be the master branch.

That is, code on any branch other than the default one (usually master) is not searched.

Only files smaller than 384 KB are searchable.

That is, files of 384 KB or larger cannot be searched.

Only repositories with fewer than 500,000 files are searchable.

That is, repositories with 500,000 or more files cannot be searched.

You can’t use the following wildcard characters as part of your search query: . , : ; / \ ` ' " = * ! ? # $ & + ^ | ~ < > ( ) { } [ ]. The search will simply ignore these symbols.

That is, the special characters listed above cannot be used as search keywords.
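
The limits above can be written down as a small predicate, which is handy when estimating what a search-based monitor will miss. This is only a sketch: the function and parameter names are hypothetical, "384 KB" is assumed to mean 384 * 1024 bytes, and effective_query only approximates how ignored symbols are dropped.

```python
MAX_FILE_BYTES = 384 * 1024  # assumes "384 KB" means 384 * 1024 bytes
MAX_REPO_FILES = 500_000
IGNORED_CHARS = set(".,:;/\\`'\"=*!?#$&+^|~<>(){}[]")


def is_searchable(size_bytes, repo_file_count, on_default_branch=True,
                  is_fork=False, fork_stars=0, parent_stars=0):
    """Apply the documented limits to decide whether code search can see a file."""
    if not on_default_branch:
        return False
    if size_bytes >= MAX_FILE_BYTES or repo_file_count >= MAX_REPO_FILES:
        return False
    # Forks are indexed only when they have more stars than the parent.
    if is_fork and fork_stars <= parent_stars:
        return False
    return True


def effective_query(query):
    """Approximate what remains of a query once ignored symbols are dropped."""
    cleaned = "".join(" " if ch in IGNORED_CHARS else ch for ch in query)
    return " ".join(cleaned.split())
```

For instance, effective_query('password="hunter2"') yields the two plain terms password and hunter2, which shows why a search alone cannot pin down an exact assignment syntax.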

Notes

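GitHub search does not support regex matching online, so the only option is to search for specific keywords and then match offline. The paper below describes the same approach: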
https://www.ndss-symposium.org/wp-content/uploads/2019/02/ndss2019_04B-3_Meli_paper.pdf
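
A sketch of this two-phase approach in Python: query the code search API for a plain keyword, ask for text-match fragments, then apply the regex locally. The function names are hypothetical, token is assumed to be a personal access token, and rate limiting and pagination are left out.

```python
import json
import re
import urllib.parse
import urllib.request

SEARCH_URL = "https://api.github.com/search/code"


def search_fragments(keyword, token):
    """Query code search and return (html_url, fragment) pairs for the hits."""
    url = SEARCH_URL + "?" + urllib.parse.urlencode({"q": keyword, "per_page": 50})
    request = urllib.request.Request(url, headers={
        "Authorization": f"Bearer {token}",
        # Ask GitHub to include the matched text fragments in each result.
        "Accept": "application/vnd.github.text-match+json",
    })
    with urllib.request.urlopen(request) as response:
        items = json.load(response).get("items", [])
    return [(item["html_url"], match["fragment"])
            for item in items
            for match in item.get("text_matches", [])]


def match_offline(fragments, pattern):
    """Apply a regex locally to fragments returned by the keyword search."""
    regex = re.compile(pattern)
    return [(url, found.group(0))
            for url, fragment in fragments
            for found in regex.finditer(fragment)]
```

A typical call would be match_offline(search_fragments("AKIA", token), r"AKIA[0-9A-Z]{16}"). Fragments are short excerpts, so a production monitor would more likely download the full file before matching.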
