Training AI models on sensitive data - the Federated Learning approach

Sarbaree Mishra; Vineela Komandla; Srikanth Bandi; Sairamesh Konidala; Jeevan Manda

Authors

Sarbaree Mishra, Program Manager at Molina Healthcare Inc., USA
Vineela Komandla, Vice President - Product Manager, JP Morgan
Srikanth Bandi, Software Engineer, JP Morgan Chase, USA
Sairamesh Konidala, Vice President, JP Morgan & Chase, USA
Jeevan Manda, Project Manager, Metanoia Solutions Inc, USA

Keywords:

Federated Learning, Sensitive Data

Abstract

As artificial intelligence (AI) becomes increasingly integrated into various sectors, training AI models on sensitive data presents both opportunities and challenges. Traditional approaches to AI model training rely on centralized systems, where large datasets are gathered and processed on a central server. While this approach has been practical, it raises significant privacy and security concerns, particularly when dealing with sensitive or personally identifiable information. Federated Learning (FL) offers a promising solution to these challenges by enabling AI models to be trained directly on decentralized data sources without transferring sensitive data to a central location. FL works by aggregating model updates from multiple sources rather than raw data, so the data remains local to its origin. This reduces the risk of data breaches and helps organizations comply with stringent data protection regulations such as GDPR. This article explores the foundational principles behind Federated Learning, including its architecture, core components, and the role of secure aggregation protocols in maintaining confidentiality. It also highlights the growing range of applications for FL, from healthcare and finance to mobile devices, where data privacy is paramount. Furthermore, the article discusses the advantages of FL, such as improved privacy, reduced bandwidth consumption, and enhanced model performance through collaborative learning, while also acknowledging the challenges, including communication efficiency, model synchronization, and the complexities of implementing FL at scale. As the demand for privacy-preserving technologies continues to rise, Federated Learning is a crucial innovation in the responsible development of AI.
The conclusion examines the potential of FL to transform industries by enabling organizations to deploy AI in a manner that is both secure and compliant, fostering trust and ethical AI development in an increasingly data-sensitive world.
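
The aggregation step described in the abstract can be illustrated with a minimal sketch of federated averaging. This is not the authors' implementation. The model (a one-variable linear fit), the function names `local_update`, `federated_average`, and `run_rounds`, the learning rate, and the toy client datasets are all assumptions chosen for illustration. Each client trains on its own data and returns only model weights; the server combines them as a weighted average by client dataset size and never sees the raw records.

```python
from typing import List, Tuple

Weights = List[float]               # [w, b] for the toy model y = w * x + b
Dataset = List[Tuple[float, float]]  # private (x, y) pairs held by one client


def local_update(weights: Weights, data: Dataset,
                 lr: float = 0.1, epochs: int = 5) -> Weights:
    """Run a few gradient-descent steps on one client's private data.

    Only the resulting weights are returned; the data never leaves this call.
    """
    w, b = weights
    for _ in range(epochs):
        # Gradients of the mean squared error for y = w * x + b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / len(data)
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / len(data)
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b]


def federated_average(client_weights: List[Weights],
                      client_sizes: List[int]) -> Weights:
    """Server-side step: average client weights, weighted by dataset size."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(cw[i] * n for cw, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def run_rounds(clients: List[Dataset], rounds: int = 50) -> Weights:
    """Alternate local training and server aggregation for a fixed number of rounds."""
    global_weights = [0.0, 0.0]
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in clients]
        sizes = [len(data) for data in clients]
        global_weights = federated_average(updates, sizes)
    return global_weights


# Hypothetical toy data: two clients, each holding points on y = 2x + 1.
clients = [
    [(-1.0, -1.0), (0.0, 1.0)],
    [(1.0, 3.0), (0.5, 2.0), (-0.5, 0.0)],
]
final_weights = run_rounds(clients)
print(final_weights)  # should approach w = 2, b = 1
```

In this sketch the weights are sent to the server in the clear. A secure aggregation protocol of the kind the abstract mentions would additionally hide each individual update so that the server learns only the combined result; that layer is omitted here.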

Published

02-04-2020

Issue

Vol. 6 (2020): Distributed Learning and Broad Applications in Scientific Research

Section

Articles

License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

License Terms

Ownership and Licensing:

Authors of research papers submitted to Distributed Learning and Broad Applications in Scientific Research retain the copyright of their work while granting the journal certain rights. Authors maintain ownership of the copyright and have granted the journal a right of first publication. Simultaneously, authors agree to license their research papers under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) License.

License Permissions:

Under the CC BY-NC-SA 4.0 License, others are permitted to share and adapt the work, as long as proper attribution is given to the authors and acknowledgement is made of the initial publication in the journal. This license allows for the broad dissemination and utilization of research papers.

Additional Distribution Arrangements:

Authors are free to enter into separate contractual arrangements for the non-exclusive distribution of the journal's published version of the work. This may include posting the work to institutional repositories, publishing it in journals or books, or other forms of dissemination. In such cases, authors are requested to acknowledge the initial publication of the work in this journal.

Online Posting:

Authors are encouraged to share their work online, including in institutional repositories, disciplinary repositories, or on their personal websites. This permission applies both prior to and during the submission process to the journal. Online sharing enhances the visibility and accessibility of the research papers.

Responsibility and Liability:

Authors are responsible for ensuring that their research papers do not infringe upon the copyright, privacy, or other rights of any third party. Scientific Research Canada disclaims any liability or responsibility for any copyright infringement or violation of third-party rights in the research papers.

If you have any questions or concerns regarding these license terms, please contact us at editor@dlabi.org.
