About
FL4NLP @ ACL 2022
Welcome to the 1st FL4NLP Workshop, co-located with ACL 2022!
Due to increasing concerns and regulations around data privacy (e.g., the General Data Protection Regulation), coupled with the growing computational power of edge devices, data from real users have become much more fragmented, forming distributed private datasets across different clients (i.e., organizations or personal devices). To respect users’ privacy and comply with these regulations, we must assume that a client’s data cannot be transferred to a centralized server or to other clients. For example, a hospital does not want to share its private data (e.g., conversations or questions asked on its website/app) with other hospitals, even though models trained on a centralized dataset (i.e., combining data from all clients) usually perform better on downstream tasks (e.g., dialogue, question answering). It is therefore of vital importance to study NLP problems in this scenario, where data are distributed across isolated organizations or remote devices and cannot be shared due to privacy concerns.
The field of federated learning (FL) aims to enable many individual clients to jointly train their models while keeping their local data decentralized and completely private from other users and from a centralized server. In a common FL training schema, each client sends its model parameters to the server, which updates the global model and sends it back to all clients in each round. Since the raw data of one client is never exposed to others, FL promises to be an effective way to address the above challenges, particularly in NLP, where much user-generated text contains sensitive, personal information.
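The round-based schema described above can be sketched in a few lines with federated averaging (FedAvg), a common FL aggregation method. This is a minimal illustrative sketch, not any specific system from the workshop: the linear-regression task, client data, and function names are all assumptions chosen to keep the example self-contained.

```python
import numpy as np

def client_update(global_weights, local_data, lr=0.05, epochs=1):
    """Local step: a client refines the global model on its private data
    (here, linear regression via gradient descent) without sharing raw examples."""
    w = global_weights.copy()
    X, y = local_data
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)  # gradient of mean squared error
        w -= lr * grad
    return w

def federated_averaging(global_weights, clients, rounds=50):
    """Server loop: broadcast the global model, collect the clients' updated
    parameters, and aggregate them by a data-size-weighted average."""
    for _ in range(rounds):
        updates = [client_update(global_weights, data) for data in clients]
        sizes = [len(y) for _, y in clients]
        global_weights = np.average(updates, axis=0, weights=sizes)
    return global_weights

# Toy setup: three clients hold private, slightly non-IID linear data.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
clients = []
for shift in (0.0, 0.5, 1.0):  # each client's features have a different mean
    X = rng.normal(shift, 1.0, size=(50, 2))
    y = X @ true_w + rng.normal(0.0, 0.01, size=50)
    clients.append((X, y))

w = federated_averaging(np.zeros(2), clients)
```

Note that only model parameters cross the network; each client's `(X, y)` stays local, which is exactly the privacy property motivating FL for NLP.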
Topics of interest include, but are not limited to:
- Federated learning methods for NLP tasks and models (e.g., Transformer-based language models, dialogue systems).
- New learning frameworks to tackle data heterogeneity, label deficiency, data shift, and generalization issues in FL for NLP, including continual learning, multi-task learning, and self-/semi-/un-supervised learning.
- Efficient training methods for resource-constrained on-device NLP, including training-time compression and communication-, computation-, and memory-efficient methods.
- Security and privacy in FL for NLP, including new attack methods (e.g., data and model poisoning), defense methods (e.g., empirical and certifiable defenses), robust aggregation methods, differential privacy (DP), and homomorphic encryption (HE).
- Fair FL for NLP, including introducing fairness notions into FL, mitigating different types of bias in NLP applications under FL settings, introducing benchmark datasets and tasks for fair FL in NLP, and auditing NLP applications in FL settings.
- Interpretability of FL for NLP, especially understanding how NLP models behave under data heterogeneity.
- Scalability of FL4NLP: e.g., client sampling algorithms.
- Benchmarking datasets (with realistic non-I.I.D. partitions) and new applications in NLP and beyond.
Program
Workshop Program
Please join us on Slack (#acl2022-fl4nlp-workshop).
Underline: LINK.
Zoom: https://us06web.zoom.us/j/87819510674?pwd=UEVleW5JSVpkUXJBQlR2TEtRV1hRZz09
Accepted Papers
Archival Papers
- [2] ActPerFL: Active Personalized Federated Learning
  Authors: Huili Chen, Jie Ding, Eric William Tramel, Shuang Wu, Anit Kumar Sahu, Salman Avestimehr, Tao Zhang
- [4] Scaling Language Model Size in Cross-Device Federated Learning
  Authors: Jae Hun Ro, Theresa Breiner, Lara McConnaughey, Mingqing Chen, Ananda Theertha Suresh, Shankar Kumar, Rajiv Mathews
- [6] Adaptive Differential Privacy for Language Model Training
  Authors: Xinwei Wu, Li Gong, Deyi Xiong
- [7] Intrinsic Gradient Compression for Scalable and Efficient Federated Learning
  Authors: Luke Melas-Kyriazi, Franklyn Wang
Non-Archival Papers
- [1] Backdoor Attacks in Federated Learning by Poisoned Word Embeddings
  Authors: KiYoon Yoo, Nojun Kwak
- [3] UserIdentifier: Implicit User Representations for Simple and Effective Personalized Sentiment Analysis
  Authors: Fatemehsadat Mireshghallah, Vaishnavi Shrivastava, Milad Shokouhi, Taylor Berg-Kirkpatrick, Robert Sim, Dimitrios Dimitriadis
- [5] Efficient Federated Learning on Knowledge Graphs via Privacy-preserving Relation Embedding Aggregation
  Authors: Kai Zhang, Yu Wang, Hongyi Wang, Lifu Huang, Carl Yang, Lichao Sun
- [8] Training a Tokenizer for Free with Private Federated Learning
  Authors: Eugene Bagdasaryan, Congzheng Song, Rogier van Dalen, Matt Seigel, Áine Cahill
- [10] Pretrained Models for Multilingual Federated Learning
  Authors: Orion Weller, Marc Marone, Vladimir Braverman, Dawn Lawrie, Benjamin Van Durme
Calls
Call for Papers
Important Dates (Updated!!!)
- Regular submission deadline (never-published work): Mar 7, 2022 (extended from Feb 28)
- ARR submission deadline (submissions with ARR reviews): Mar 21, 2022
- Notification of Acceptance (for regular and ARR submissions): March 26, 2022
- Published submission deadline (published at other venues): April 8, 2022
- Camera-ready papers due: April 10, 2022
- Workshop Date: May 27, 2022
- All deadlines are in Anywhere on Earth (AoE) time.
Submission Instructions
We solicit two categories of papers.
Workshop papers (regular/ARR): describing new, previously unpublished research in this field.
The submissions should follow the ACL-ARR style guidelines.
We accept both short papers (4 pages of content) and long papers (8 pages of content).
Submissions will be subject to a double-blind review process (i.e., they must be anonymized).
Final versions of accepted papers will be allowed 1 additional page of content so that reviewer comments can be taken into account.
Please fill this google form to submit your papers: https://forms.gle/AToY6HYZ6buydSVv5
Published papers: papers on topics relevant to the workshop theme, previously published at NLP or ML conferences. These papers can be submitted in their original format without hiding the author names. Submissions will be reviewed for fit to the workshop topics.
In both categories, accepted papers:
- can be non-archival
- published on the workshop website
- presented at the workshop as a lightning talk
Cross-submission Policy: As long as it doesn't conflict with the cross-submission policies of the other venue (e.g., the ARR policy), you may submit your paper to FL4NLP as a regular workshop paper. Please feel free to email us if you are not sure about your case!
Please submit your paper via Openreview: https://openreview.net/group?id=aclweb.org/ACL/2022/Workshop/FL4NLP
Amazon Best (Student) Paper Awards
There will be a best paper award and a best student paper award honoring exceptional papers published at the FL4NLP workshop, both sponsored by Amazon Alexa AI.

Talks
Invited Speakers
Salman Avestimehr
Professor at University of Southern California
Virginia Smith
Assistant Prof. at CMU
Bo Li
Assistant Prof. at UIUC
Tong Zhang
Professor at HKUST
Manzil Zaheer
Research Scientist at Google DeepMind
Rahul Gupta
Applied Science Manager at Amazon Alexa

Panel Discussion
Panelists
Tom Diethe
Amazon Research
Xu Zheng
Google FL team
Gauri Joshi
Assistant Prof. at CMU
Anna Rumshisky
Associate Prof. at UMass

Organization
Workshop Organizers
Bill Yuchen Lin
PhD Candidate @ USC
Chaoyang He
PhD Candidate @ USC
Chulin Xie
PhD Student @ UIUC
Fatemehsadat Mireshghallah
PhD Candidate @ UCSD
Ninareh Mehrabi
PhD Candidate @ USC-ISI
Tian Li
PhD Student @ CMU
Mahdi Soltanolkotabi
Associate Prof. @ USC
Xiang Ren
Assistant Prof. @ USC

Program Committee
- Hongyuan Zhan (Facebook)
- Anit Kumar Sahu (Amazon Alexa AI)
- Bahareh Harandizadeh (University of Southern California)
- Basak Guler (University of California, Riverside)
- Dimitris Stripelis (University of Southern California)
- Eugene Bagdasaryan (Cornell University)
- Farzin Haddadpour (Yale University)
- Gerald Penn (University of Toronto)
- Hongyi Wang (Carnegie Mellon University)
- Jinhyun So (University of Southern California)
- Jun Yan (University of Southern California)
- Kshitiz Malik (Facebook)
- Kevin Hsieh (Microsoft)
- Ninareh Mehrabi (University of Southern California)
- Roozbeh Yousefzadeh (Yale University)
- Saurav Prakash (University of Southern California)
- Shen Li (Facebook)
- Shengyuan Hu (Carnegie Mellon University)
- Sijie Cheng (Fudan University)
- Sunwoo Lee (University of Southern California)
- Tao Yu (The University of Hong Kong)
- Umang Gupta (University of Southern California)
- Xin Dong (Harvard University)
- Xuechen Li (Stanford University)
- Yae Jee Cho (Carnegie Mellon University)
- Zheng Xu (Google)