About
FL4NLP @ ACL 2022
Welcome to the 1st FL4NLP Workshop, co-located with ACL 2022!
Due to increasing concerns and regulations around data privacy (e.g., the General Data Protection Regulation), coupled with the growing computational power of edge devices, data from real users have become much more fragmented, forming distributed private datasets across different clients (i.e., organizations or personal devices). To respect users’ privacy and comply with these regulations, we must assume that a client’s data cannot be transferred to a centralized server or to other clients. For example, a hospital does not want to share its private data (e.g., conversations or questions asked on its website/app) with other hospitals, even though models trained on a centralized dataset (i.e., combining data from all clients) usually perform better on downstream tasks (e.g., dialogue, question answering). It is therefore of vital importance to study NLP problems in this scenario, where data are distributed across isolated organizations or remote devices and cannot be shared due to privacy concerns.
The field of federated learning (FL) aims to enable many individual clients to jointly train their models while keeping their local data decentralized and completely private from other users and from a centralized server. In a common FL training schema, each client sends its model parameters to the server, which updates the global model and sends it back to all clients in each round. Since the raw data of one client is never exposed to others, FL promises to be an effective way to address the above challenges, particularly in NLP, where much user-generated text contains sensitive, personal information.
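The round-based schema described above can be sketched in a few lines with federated averaging (FedAvg), a common FL aggregation method. This is a minimal illustrative sketch, not any specific system from the workshop: the linear-regression task, client data, and function names are all assumptions chosen to keep the example self-contained.

```python
import numpy as np

def client_update(global_weights, local_data, lr=0.05, epochs=1):
    """Local step: a client refines the global model on its private data
    (here, linear regression via gradient descent) without sharing raw examples."""
    w = global_weights.copy()
    X, y = local_data
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)  # gradient of mean squared error
        w -= lr * grad
    return w

def federated_averaging(global_weights, clients, rounds=50):
    """Server loop: broadcast the global model, collect the clients' updated
    parameters, and aggregate them by a data-size-weighted average."""
    for _ in range(rounds):
        updates = [client_update(global_weights, data) for data in clients]
        sizes = [len(y) for _, y in clients]
        global_weights = np.average(updates, axis=0, weights=sizes)
    return global_weights

# Toy setup: three clients hold private, slightly non-IID linear data.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
clients = []
for shift in (0.0, 0.5, 1.0):  # each client's features have a different mean
    X = rng.normal(shift, 1.0, size=(50, 2))
    y = X @ true_w + rng.normal(0.0, 0.01, size=50)
    clients.append((X, y))

w = federated_averaging(np.zeros(2), clients)
```

Note that only model parameters cross the network; each client's `(X, y)` stays local, which is exactly the privacy property motivating FL for NLP.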
Topics of interest include, but are not limited to:
- Federated learning methods for NLP tasks and models (e.g., Transformer-based language models, dialogue systems).
- New learning frameworks to tackle data heterogeneity, label deficiency, data shift, and generalization issues in FL for NLP, including continual learning, multi-task learning, and self-/semi-/un-supervised learning.
- Efficient training methods for resource-constrained on-device NLP, including training-time compression and communication-, computation-, and memory-efficient methods.
- Security and privacy in FL for NLP, including new attack methods (e.g., data and model poisoning), defense methods (e.g., empirical and certifiable defenses), robust aggregation methods, differential privacy (DP), and homomorphic encryption (HE).
- Fair FL for NLP, including introducing fairness notions into FL, mitigating different types of bias in NLP applications under FL settings, introducing benchmark datasets and tasks for fair FL in NLP, and auditing NLP applications in FL settings.
- Interpretability of FL for NLP, especially understanding how NLP models behave under data heterogeneity.
- Scalability of FL4NLP: e.g., client sampling algorithms.
- Benchmarking datasets (with realistic non-I.I.D. partitions) and new applications in NLP and beyond.
Program
Workshop Program
Please join us on Slack (#acl2022-fl4nlp-workshop).
Underline: LINK.
Zoom: https://us06web.zoom.us/j/87819510674?pwd=UEVleW5JSVpkUXJBQlR2TEtRV1hRZz09
Accepted Papers
Archival Papers
- [2] ActPerFL: Active Personalized Federated Learning
  Authors: Huili Chen, Jie Ding, Eric William Tramel, Shuang Wu, Anit Kumar Sahu, Salman Avestimehr, Tao Zhang
- [4] Scaling Language Model Size in Cross-Device Federated Learning
  Authors: Jae Hun Ro, Theresa Breiner, Lara McConnaughey, Mingqing Chen, Ananda Theertha Suresh, Shankar Kumar, Rajiv Mathews
- [6] Adaptive Differential Privacy for Language Model Training
  Authors: Xinwei Wu, Li Gong, Deyi Xiong
- [7] Intrinsic Gradient Compression for Scalable and Efficient Federated Learning
  Authors: Luke Melas-Kyriazi, Franklyn Wang
Non-Archival Papers
- [1] Backdoor Attacks in Federated Learning by Poisoned Word Embeddings
  Authors: KiYoon Yoo, Nojun Kwak
- [3] UserIdentifier: Implicit User Representations for Simple and Effective Personalized Sentiment Analysis
  Authors: Fatemehsadat Mireshghallah, Vaishnavi Shrivastava, Milad Shokouhi, Taylor Berg-Kirkpatrick, Robert Sim, Dimitrios Dimitriadis
- [5] Efficient Federated Learning on Knowledge Graphs via Privacy-preserving Relation Embedding Aggregation
  Authors: Kai Zhang, Yu Wang, Hongyi Wang, Lifu Huang, Carl Yang, Lichao Sun
- [8] Training a Tokenizer for Free with Private Federated Learning
  Authors: Eugene Bagdasaryan, Congzheng Song, Rogier van Dalen, Matt Seigel, Áine Cahill
- [10] Pretrained Models for Multilingual Federated Learning
  Authors: Orion Weller, Marc Marone, Vladimir Braverman, Dawn Lawrie, Benjamin Van Durme
Calls
Call for Papers
Important Dates (Updated!!!)
- Regular submission deadline (never-published work): Mar 7, 2022 (extended from Feb 28)
- ARR submission deadline (submissions with ARR reviews): Mar 21, 2022
- Notification of Acceptance (for regular and ARR submissions): March 26, 2022
- Published submission deadline (published at other venues): April 8, 2022
- Camera-ready papers due: April 10, 2022
- Workshop Date: May 27, 2022
- All deadlines are in Anywhere on Earth (AoE) time.
Submission Instructions
We solicit two categories of papers.
Workshop papers (regular/ARR): describing new, previously unpublished research in this field.
The submissions should follow the ACL-ARR style guidelines.
We accept both short papers (4 pages of content) and long papers (8 pages of content).
Submissions will be subject to a double-blind review process (i.e., they must be anonymized).
Final versions of accepted papers will be allowed 1 additional page of content so that reviewer comments can be taken into account.
Please fill this google form to submit your papers: https://forms.gle/AToY6HYZ6buydSVv5
Published papers: papers on topics relevant to the workshop theme, previously published at NLP or ML conferences. These papers can be submitted in their original format without hiding the author names. Submissions will be reviewed for fit to the workshop topics.
In both categories, accepted papers:
- can be non-archival
- published on the workshop website
- presented at the workshop as a lightning talk
Cross-submission Policy: As long as it doesn't conflict with the cross-submission policies of the other venue (e.g., the ARR policy), you may submit your paper to FL4NLP as a regular workshop paper. Please feel free to email us if you are not sure about your case!
Please submit your paper via Openreview: https://openreview.net/group?id=aclweb.org/ACL/2022/Workshop/FL4NLP
Amazon Best (Student) Paper Awards
There will be a best paper award and a best student paper award honoring exceptional papers published at the FL4NLP workshop, both sponsored by Amazon Alexa AI.

Talks
Invited Speakers
Salman Avestimehr
Professor at University of Southern California
Virginia Smith
Assistant Prof. at CMU
Bo Li
Assistant Prof. at UIUC
Tong Zhang
Professor at HKUST
Manzil Zaheer
Research Scientist at Google DeepMind
Rahul Gupta
Applied Science Manager at Amazon Alexa

Panel Discussion
Panelists
Tom Diethe
Amazon Research
Xu Zheng
Google FL team
Gauri Joshi
Assistant Prof. at CMU
Anna Rumshisky
Associate Prof. at UMass

Organization
Workshop Organizers
Bill Yuchen Lin
PhD Candidate @ USC
Chaoyang He
PhD Candidate @ USC
Chulin Xie
PhD Student @ UIUC
Fatemehsadat Mireshghallah
PhD Candidate @ UCSD
Ninareh Mehrabi
PhD Candidate @ USC-ISI
Tian Li
PhD Student @ CMU
Mahdi Soltanolkotabi
Associate Prof. @ USC
Xiang Ren
Assistant Prof. @ USC

Program Committee
- Hongyuan Zhan (Facebook)
- Anit Kumar Sahu (Amazon Alexa AI)
- Bahareh Harandizadeh (University of Southern California)
- Basak Guler (University of California, Riverside)
- Dimitris Stripelis (University of Southern California)
- Eugene Bagdasaryan (Cornell University)
- Farzin Haddadpour (Yale University)
- Gerald Penn (University of Toronto)
- Hongyi Wang (Carnegie Mellon University)
- Jinhyun So (University of Southern California)
- Jun Yan (University of Southern California)
- Kshitiz Malik (Facebook)
- Kevin Hsieh (Microsoft)
- Ninareh Mehrabi (University of Southern California)
- Roozbeh Yousefzadeh (Yale University)
- Saurav Prakash (University of Southern California)
- Shen Li (Facebook)
- Shengyuan Hu (Carnegie Mellon University)
- Sijie Cheng (Fudan University)
- Sunwoo Lee (University of Southern California)
- Tao Yu (The University of Hong Kong)
- Umang Gupta (University of Southern California)
- Xin Dong (Harvard University)
- Xuechen Li (Stanford University)
- Yae Jee Cho (Carnegie Mellon University)
- Zheng Xu (Google)