Merge pull request #1 from LCTT/master
Update
wxy authored Oct 9, 2018
2 parents 10d899a + 3d5f012 commit 84ce6c6
Showing 8 changed files with 93 additions and 23 deletions.
4 changes: 4 additions & 0 deletions changelog.md
@@ -1,5 +1,9 @@
 # Changelog

+## 0.0.23
+
+1. **Hotfix** Show public email domains and filter out GitHub email addresses. Contributor: @wxy
+
 ## 0.0.22

 1. **New feature** Forked repositories are no longer counted
42 changes: 42 additions & 0 deletions github_org_china.md
@@ -0,0 +1,42 @@
+### GitHub addresses of open source projects from Chinese companies
+
+- [Alibaba](https://github.com/alibaba)
+- [Ant Design](https://github.com/ant-design)
+- [Ant Financial](https://github.com/antvis)
+- [Angular Developers](https://github.com/NG-ZORRO)
+- [Alipay](https://github.com/aliceui)
+- [kissyteam](https://github.com/kissyteam)
+- [Taobao](https://github.com/taobao)
+- [Tmall](https://github.com/tmallfe)
+- [seajs](https://github.com/seajs)
+- [eggjs](https://github.com/eggjs)
+- [Huawei](https://github.com/Huawei)
+- [Huawei Hadoop](https://github.com/Huawei-Hadoop)
+- [Tencent](https://github.com/tencent)
+- [AlloyTeam](https://github.com/AlloyTeam)
+- [Baidu](https://github.com/baidu)
+- [FEX](https://github.com/fex-team)
+- [EFE](https://github.com/ecomfe)
+- [Front-end](https://github.com/be-fe)
+- [Ele.me](https://github.com/eleme)
+- [Front-end](https://github.com/elemefe)
+- [NetEase](https://github.com/netease)
+- [Sohu](https://github.com/SOHUDBA)
+- [Qihoo 360](https://github.com/Qihoo360)
+- [360 Enterprise Security](https://github.com/360EntSecGroup-Skylar)
+- [Vipshop](https://github.com/vipshop)
+- [Douban](https://github.com/douban)
+- [Dianping](https://github.com/dianping)
+- [Meituan-Dianping](https://github.com/Meituan-Dianping)
+- [Xiaomi](https://github.com/xiaomi)
+- [Meituan](https://github.com/meituan)
+- [Mogujie](https://github.com/mogujie)
+- [Wandoujia](https://github.com/CodisLabs)
+- [Dangdang](https://github.com/dangdangdotcom)
+- [Youzan](https://github.com/youzan)
+- [Deepin](https://github.com/linuxdeepin)
+- [DNSPod](https://github.com/DNSPod)
+- [Sina Weibo](https://github.com/weibocom)
+
+(Originally based on [this source](https://github.com/jaywcjlove/handbook/blob/master/other/Github-Oraganizations.md), with documentation-only projects removed.)

2 changes: 2 additions & 0 deletions grank/script/crawler.py
@@ -15,6 +15,8 @@ def fetch_repo_data(owner, repository, config):
     # Append a placeholder record that can be filtered out later, so the following steps can run to completion
     commitArray.append({
         'author': 'localhost',
+        'domain': '',
+        'is_corp': False,
         'date': '未标注时间',
         "times": 1
     })
15 changes: 9 additions & 6 deletions grank/script/social.py
@@ -24,11 +24,11 @@ def analyse_email(data,config):
     df = pd.DataFrame(data["commitArray"])

     for index,row in df.iterrows():
-        df.loc[index,"author"] = helpers.detect_email_dmain(row["author"])
+        df.loc[index,"domain"] = helpers.detect_email_domain(row["author"])

     click.echo('')
     # click.echo(df['author'].value_counts().drop(labels=common_mail,errors='ignore'))
-    click.echo(df['author'].value_counts().drop(labels=ignore_mail,errors='ignore'))
+    click.echo(df['domain'].value_counts().drop(labels=ignore_mail,errors='ignore'))
     click.echo('')

     new_rule = click.prompt('请输入新的社区化识别的正则规则',default=config["social"]["rule"])
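
The domain tally that `analyse_email` prints can be pictured without pandas. The following is a minimal standard-library sketch, not the project's code: `domain_counts` and the sample domains are hypothetical, and `collections.Counter` stands in for `value_counts().drop(labels=ignore_mail, errors='ignore')`.

```python
from collections import Counter


def domain_counts(domains, ignore):
    """Count each e-mail domain, then drop the ones listed in `ignore`.

    Domains in `ignore` that never occur are skipped silently, which is
    the role errors='ignore' plays in the pandas version.
    """
    counts = Counter(domains)
    for domain in ignore:
        counts.pop(domain, None)
    # Most frequent first, similar in spirit to value_counts().
    return counts.most_common()


# Hypothetical sample data for illustration only.
sample = ["@a.example", "@b.example", "@a.example", "@gmail.com"]
print(domain_counts(sample, ignore=["@gmail.com", "@missing.example"]))
```

Keeping the raw `author` column and writing the result to a separate `domain` column, as the hunk above does, means the original address is still available for later steps.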
@@ -52,13 +52,16 @@ def analyse_repo(owner, repository, data, config):
         np.zeros((len(date_range),), dtype=int), index=date_range)

     social_all_frame = pd.DataFrame(commitArray)
-    social_all_frame = social_all_frame[(social_all_frame.author != '') & (social_all_frame.author != '@users.noreply.github.com') & (social_all_frame.date != "未标注时间")]
+    for index,row in social_all_frame.iterrows():
+        social_all_frame.loc[index,"domain"] = helpers.detect_email_domain(row["author"])
+
+    social_all_frame = social_all_frame[(social_all_frame.domain != '') & (social_all_frame.domain != '@users.noreply.github.com') & (social_all_frame.date != "未标注时间")]
     social_all_frame["date"] = pd.to_datetime(social_all_frame['date'])
     for index, row in social_all_frame.iterrows():
-        social_all_frame.loc[index, "author"] = helpers.is_corp(
-            row["author"], config)
+        social_all_frame.loc[index, "is_corp"] = helpers.is_corp(
+            row["domain"], config)

-    community_df = social_all_frame[social_all_frame.author != True].set_index(
+    community_df = social_all_frame[social_all_frame.is_corp != True].set_index(
         'date').resample('W')['times'].sum()
     social_all_df = social_all_frame.set_index(
         'date').resample('W')['times'].sum()
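
To make the filtering and weekly aggregation in this hunk easier to follow, here is a rough standard-library sketch of the same idea. It is an illustration under assumptions, not the project's implementation: the function names are hypothetical, dates are assumed to be plain `YYYY-MM-DD` strings, and weeks are assumed to close on Sunday, which is how I understand the default of `resample('W')`.

```python
from collections import defaultdict
from datetime import date, datetime, timedelta

PLACEHOLDER_DATE = "未标注时间"  # sentinel date carried by the crawler's placeholder record
NOREPLY_DOMAIN = "@users.noreply.github.com"


def week_end(day: date) -> date:
    """Return the Sunday that closes the week containing `day`."""
    return day + timedelta(days=6 - day.weekday())


def weekly_community_commits(commits):
    """Sum `times` per week for non-corporate commits with a usable domain."""
    totals = defaultdict(int)
    for commit in commits:
        # Same three exclusions as the DataFrame filter: empty domain,
        # GitHub noreply addresses, and the placeholder date.
        if commit["domain"] in ("", NOREPLY_DOMAIN):
            continue
        if commit["date"] == PLACEHOLDER_DATE:
            continue
        if commit["is_corp"]:
            continue  # corporate commits are not community contributions
        day = datetime.strptime(commit["date"], "%Y-%m-%d").date()
        totals[week_end(day)] += commit["times"]
    return dict(totals)
```

Because the placeholder record now carries `domain` and `is_corp` keys, every record has the same shape and the exclusions can be applied uniformly.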
32 changes: 16 additions & 16 deletions requirements.txt
@@ -1,22 +1,22 @@
bleach==2.1.4
atomicwrites==1.2.1
attrs==18.2.0
certifi==2018.8.24
cffi==1.11.5
chardet==3.0.4
Click==7.0
cmarkgfm==0.4.2
docopt==0.6.2
docutils==0.14
future==0.16.0
html5lib==1.0.1
cycler==0.10.0
Grank==0.0.23
idna==2.7
pkginfo==1.4.2
pycparser==2.19
Pygments==2.2.0
readme-renderer==22.0
kiwisolver==1.0.1
matplotlib==3.0.0
more-itertools==4.3.0
numpy==1.15.2
pandas==0.23.4
pluggy==0.7.1
py==1.6.0
pyparsing==2.2.2
pytest==3.8.2
python-dateutil==2.7.3
pytz==2018.5
requests==2.19.1
requests-toolbelt==0.8.0
six==1.11.0
tqdm==4.26.0
twine==1.12.1
urllib3==1.23
webencodings==0.5.1
urllib3==1.23
3 changes: 2 additions & 1 deletion setup.py
@@ -2,9 +2,10 @@

 setuptools.setup(
     name="Grank",
-    version="0.0.22",
+    version="0.0.23",
     author="Bestony@LCTT",
     author_email="[email protected]",
+    python_requires=">=3.4",
     description="A Github Project Rank Command Line Tool",
     long_description=open('README.rst').read(),
     url="https://github.com/LCTT/Grank",
4 changes: 4 additions & 0 deletions tests/__init__.py
@@ -0,0 +1,4 @@
+def setup():
+    import warnings
+    warnings.resetwarnings()
+    warnings.simplefilter("always")
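
The effect of the two `warnings` calls in this `setup()` can be observed in isolation. The sketch below is illustrative only: `collect_warnings` and `noisy` are hypothetical names, and `warnings.catch_warnings(record=True)` is used so that the filter changes stay local to the example.

```python
import warnings


def collect_warnings(func):
    """Run `func` with the same filter setup as tests/__init__.py and
    return the messages of all warnings it raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.resetwarnings()         # drop any previously installed filters
        warnings.simplefilter("always")  # report every occurrence, not just the first
        func()
    return [str(w.message) for w in caught]


def noisy():
    # The same warning is raised three times from the same line.
    for _ in range(3):
        warnings.warn("deprecated call", DeprecationWarning)


print(len(collect_warnings(noisy)))
```

With the `"always"` action, repeated warnings from one location are all reported, which is useful when a test run should surface every deprecation rather than only its first occurrence.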
14 changes: 14 additions & 0 deletions tests/test_helpers.py
@@ -0,0 +1,14 @@
+import pandas
+from grank.libs import helpers
+def test_detect_email_domain():
+    gmail_domain = helpers.detect_email_domain('[email protected]')
+    assert gmail_domain == '@gmail.com'
+
+    localhost_domain = helpers.detect_email_domain('localhost')
+    assert localhost_domain == ''
+
+def test_get_user_type():
+    test_user = helpers.get_user_type('bestony')
+    assert test_user == True
+    test_org = helpers.get_user_type('lctt')
+    assert test_org == False
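
The first test pins down the contract of the domain helper: an address yields its domain with a leading `@`, and a string without `@` yields an empty string. The helper's real implementation is not part of this diff, so the following is only a guess at one way to satisfy that contract; `detect_domain_sketch` and the sample address are hypothetical.

```python
def detect_domain_sketch(author):
    """Return '@domain' for an e-mail-like string, or '' when it has no '@'."""
    # rpartition splits on the last '@'; `sep` is empty when none is found.
    _, sep, domain = author.rpartition("@")
    return sep + domain if sep else ""


print(detect_domain_sketch("someone@gmail.com"))
print(repr(detect_domain_sketch("localhost")))
```

Returning an empty string for inputs such as `localhost` lets the placeholder record from `crawler.py` fall through the `domain != ''` filter in `social.py`.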
