[ { "title": "Huggingface/datasets", "url": "https://resume.alongwy.top/opensource/huggingface-datasets/", "body": "🤗 Datasets is a lightweight library providing two main features:\none-line dataloaders for many public datasets: \n\none-liners to download and pre-process any of the major public datasets (in 467 languages and dialects!) provided on the HuggingFace Datasets Hub. With a simple command like squad_dataset = load_dataset("squad"), get any of these datasets ready to use in a dataloader for training/evaluating an ML model (NumPy/Pandas/PyTorch/TensorFlow/JAX),\nefficient data pre-processing: simple, fast and reproducible data pre-processing for the above public datasets as well as your own local datasets in CSV/JSON/text. With simple commands like tokenized_dataset = dataset.map(tokenize_example), efficiently prepare the dataset for inspection and ML model evaluation and training.\n\n" } , { "title": "Language Technology Platform", "url": "https://resume.alongwy.top/projects/language-technology-platform/", "body": "Intro\nAn open-source neural language technology platform supporting six fundamental Chinese NLP tasks:\n\nlexical analysis (Chinese word segmentation, part-of-speech tagging, and named entity recognition)\nsyntactic parsing (dependency parsing)\nsemantic parsing (semantic dependency parsing and semantic role labeling). 
\n\nQuickstart\nfrom ltp import LTP\n\nltp = LTP() # loads the Small model by default\nseg, hidden = ltp.seg(["他叫汤姆去拿外衣。"])\npos = ltp.pos(hidden)\nner = ltp.ner(hidden)\nsrl = ltp.srl(hidden)\ndep = ltp.dep(hidden)\nsdp = ltp.sdp(hidden)\n\nPerformance\nModel | CWS | POS | NER | SRL | DEP | SDP | Speed (Sents/S)\nLTP 4.0 (Base) | 98.70 | 98.50 | 95.4 | 80.60 | 89.50 | 75.20 | 39.12\nLTP 4.0 (Base1) | 99.22 | 98.73 | 96.39 | 79.28 | 89.57 | 76.57 | --.--\nLTP 4.0 (Base2) | 99.18 | 98.69 | 95.97 | 79.49 | 90.19 | 76.62 | --.--\nLTP 4.0 (Small) | 98.40 | 98.20 | 94.30 | 78.40 | 88.30 | 74.70 | 43.13\nLTP 4.0 (Tiny) | 96.80 | 97.10 | 91.60 | 70.90 | 83.80 | 70.10 | 53.22\n\nCite\n@article{che2020n,\n title={N-LTP: A Open-source Neural Chinese Language Technology Platform with Pretrained Models},\n author={Che, Wanxiang and Feng, Yunlong and Qin, Libo and Liu, Ting},\n journal={arXiv preprint arXiv:2009.11616},\n year={2020}\n}\n\n" } , { "title": "NotFeed: A RSS Reader on GitHub", "url": "https://resume.alongwy.top/projects/notfeed-a-rss-reader-on-github/", "body": "NotCraft::NotFeed\nAn RSS reader running entirely from your GitHub repo.\n\nFree hosting on GitHub Pages. No ads. No third-party tracking.\nNo need for a backend. Content updates via GitHub Actions.\nCustomizable layouts and styles via templating and theming API. Just bring your HTML and CSS.\nFree and open source.\n\nHow to use it?\nGitHub Pages\n\n\nUse the NotFeed-Template to generate your own repository.\n\n\nIn the repository root, open the Config.toml file and click the "Pencil (Edit this file)" button to edit it.\n\n\nRemove # to uncomment the cache_url property, replace <github_username> with your GitHub username, and\nreplace <repo> with your GitHub repo name.\n\n\nIn the sources, update the items to the sources you want to follow. 
The final content of the file should look similar\nto this:\n# Config.toml\n\nsite_title = "ArxivDaily"\ncache_max_days = 7\nsources = [\n "https://export.arxiv.org/rss/cs.CL"\n]\n# proxy = "" ## Optional: default is None\n# statics_dir = "statics" ## Optional: default is "statics"\n# templates_dir = "includes" ## Optional: default is "includes"\n# cache_url = "https://GITHUB_USERNAME.github.io/REPO_NAME/cache.json"\n# minify = true\n# [scripts]\n# highlight = "scripts/highlight.rhai"\n\n\n\nScroll to the bottom of the page and click the "Commit changes" button.\n\n\nOnce the rebuild finishes, your feed will be available at https://<github_username>.github.io/<repo>\n\n\nLocalhost\n\n\nClone the NotFeed-Template repository.\n\n\nEdit the Config.toml file.\n\n\nRun notfeed\n\nbuild: notfeed build\nserve: notfeed serve --addr --port 8080 or simply notfeed serve\n\n\n\nThanks\n\nInspired by osmos::feed\n\n" } , { "title": "Resume: a zola theme", "url": "https://resume.alongwy.top/projects/resume-a-zola-theme/", "body": "Zola Resume\nQuick start\ngit clone git@github.com:alongwy/zola-resume.git\ncd zola-resume\nzola serve\n# open\n\nWith this method, updating the theme later may be cumbersome.\nInstallation\nStep 1: Initialize the site\nzola init mysite\n\nStep 2: Install zola-resume\nInstall the theme into the themes directory:\ncd mysite/themes\ngit clone git@github.com:alongwy/zola-resume.git\n\nOr install it as a submodule:\ncd mysite\ngit init # if your project is a git repository already, ignore this command\ngit submodule add git@github.com:alongwy/zola-resume.git themes/zola-resume\n\nStep 3: Configure the site\nEnable the theme in the config.toml configuration file:\ntheme = "zola-resume"\n\nOr copy config.toml.example directly into this directory:\ncp themes/zola-resume/config.toml.example config.toml\n\nStep 4: Add or modify content\nThen copy:\ncp -r themes/zola-resume/data .\ncp -r themes/zola-resume/content .\n\nYou can modify or add content under directories such as content/blog and content/projects. Do not delete the _index.md files in them.\nStep 5: Run the project\nUse the following command to preview the result:\nzola serve\n\nThen open the address printed by zola serve to see the result.\nStep 6: Automatic builds\nCopy the GitHub Actions configuration file:\nmkdir -p .github/workflows\ncp themes/zola-resume/build.yml .github/workflows/build.yml\n\nConfiguring the CMS
\nStep 1: Edit the configuration file\nCopy the CMS configuration file:\ncp themes/zola-resume/static/admin/config.yml static/admin/config.yml\n\nThen modify the following part:\n# static/admin/config.yml\n\nbackend:\n name: github\n repo: USERNAME/REPO # <-- remember to change this\n branch: BRANCH # <-- remember to change this\n cms_label_prefix: netlify-cms/\n site_domain: DOMAIN.netlify.com # note this field; it is used later\n\nConfigure backend authentication\nFirst, sign up for a Netlify account and set up the repository. The automatic build will fail at this point; ignore it.\nIn the site settings, open Build & deploy and turn off active builds under Build settings so that Netlify build minutes are not consumed.\nIn the settings, open Access control, find OAuth, and use Install provider to add GitHub. See this document for how to configure the GitHub app.\nFinally, add YOURNAME.github.io under Custom domains in the settings. A warning appears, but it can be ignored. Note the Default subdomain listed above it and fill it into backend.site_domain in static/admin/config.yml.\nAbout page\nEdit content/_index.md to change the home page content.\nOther files\n\ndata/certifications.json\ndata/social.json\ndata/skills.json\ndata/experience.json\ndata/education.json\n\n" } , { "title": "gpustat: a rust-version of gpustat.", "url": "https://resume.alongwy.top/projects/gpustat-a-rust-version-of-gpustat/", "body": "gpustat\n\n\nA Rust version of gpustat.\nJust less than nvidia-smi?\nUsage\n$ gpustat\nOptions:\n\n--color : Force colored output (even when stdout is not a tty)\n--no-color : Suppress colored output\n-u, --show-user : Display username of the process owner\n-c, --show-cmd : Display the process name\n-f, --show-full-cmd : Display full command and cpu stats of running process\n-p, --show-pid : Display PID of the process\n-F, --show-fan : Display GPU fan speed\n-e, --show-codec : Display encoder and/or decoder utilization\n-a, --show-all : Display all gpu properties above\n\nQuick Installation\nInstall from Cargo:\ncargo install gpustat\n\nDefault display\n\n[0] | A100-PCIE-40GB | 65'C | 75 % | 33409 / 40536 MB | along(33407M)\n\n\n[0]: GPU index (starts from 0) as PCI_BUS_ID\nA100-PCIE-40GB: GPU name\n65'C: Temperature\n75 %: Utilization\n33409 / 40536 MB: GPU Memory Usage\nalong(33407M): Username of the owner of each process running on the GPU (and its memory usage)\n\nLicense\nGPL v2 
License\n" } , { "title": "N-LTP: A Open-source Neural Chinese Language Technology Platform with Pretrained Models", "url": "https://resume.alongwy.top/publications/n-ltp-a-open-source-neural-chinese-language-technology-platform-with-pretrained-models/", "body": "An open-source neural language technology platform supporting six fundamental Chinese NLP tasks: \n\nlexical analysis (Chinese word segmentation, part-of-speech tagging, and named entity recognition)\nsyntactic parsing (dependency parsing)\nsemantic parsing (semantic dependency parsing and semantic role labeling). \n\nUnlike existing state-of-the-art toolkits, such as Stanza, that adopt an independent model for each task, N-LTP adopts a multi-task framework with a shared pre-trained model, which has the advantage of capturing the shared knowledge across relevant Chinese tasks. \nIn addition, knowledge distillation, where the single-task model teaches the multi-task model, is further introduced to encourage the multi-task model to surpass its single-task teacher.\nFinally, we provide a collection of easy-to-use APIs and a visualization tool to make it easier for users to use the toolkit and view the processing results directly. To the best of our knowledge, this is the first toolkit to support six fundamental Chinese NLP tasks. \n" } , { "title": "HIT-SCIR at MRP 2020: Transition-based Parser and Iterative Inference Parser", "url": "https://resume.alongwy.top/publications/hit-scir-at-mrp-2020-transition-based-parser-and-iterative-inference-parser/", "body": "This paper describes our submission system (HIT-SCIR) for the CoNLL 2020 shared task: Cross-Framework and Cross-Lingual Meaning Representation Parsing. \nThe task includes five frameworks for graph-based meaning representations, i.e., UCCA, EDS, PTG, AMR, and DRG. \nOur solution consists of two sub-systems: \n+ transition-based parser for Flavor (1) frameworks (UCCA, EDS, PTG)\n+ iterative inference parser for Flavor (2) frameworks (DRG, AMR). 
\nIn the final evaluation, our system is ranked 3rd among the seven teams in both the Cross-Framework Track and the Cross-Lingual Track, with macro-averaged MRP F1 scores of 0.81 and 0.69, respectively.\n" } ]