{"id":3952,"date":"2026-02-27T14:48:43","date_gmt":"2026-02-27T20:48:43","guid":{"rendered":"https:\/\/wordpress.cels.anl.gov\/lans-seminars\/?post_type=tribe_events&#038;p=3952"},"modified":"2026-03-23T11:43:26","modified_gmt":"2026-03-23T16:43:26","slug":"lans-seminar-194","status":"publish","type":"tribe_events","link":"https:\/\/wordpress.cels.anl.gov\/lans-seminars\/event\/lans-seminar-194\/","title":{"rendered":"LANS Seminar"},"content":{"rendered":"<p><strong>Seminar Title<\/strong>: Towards Scalable Federated Learning for Scientific Computing<\/p>\n<p>&nbsp;<\/p>\n<p><strong>Speaker<\/strong>: Yijiang Li, Postdoctoral Appointee, Mathematics and Computer Science Division, Argonne National Laboratory<\/p>\n<p>&nbsp;<\/p>\n<p><strong>Date:<\/strong> Thursday, March 26, 2026<\/p>\n<p><strong>Time:<\/strong> 3:00 PM-4:00 PM (In-Person)<\/p>\n<p><strong>Location:<\/strong>\u00a0Hybrid, Bldg. 240, Conference Room 4301<\/p>\n<p><strong>Description<\/strong>: Federated learning (FL) enables collaborative model training without centralizing sensitive data, but making FL work in practice requires more than better algorithms. It requires algorithms co-designed with the infrastructure they run on. This principle becomes especially apparent in high-performance computing (HPC) environments, where heterogeneous architectures, strict firewall policies, and batch job schedulers introduce challenges qualitatively different from those studied in the existing FL literature. We present a systematic deployment of FL across multiple supercomputers, taking four DOE facilities as a case study representative of general HPC ecosystems. Extensive experiments expose key system-level bottlenecks and reveal that stochastic batch scheduler delays critically affect training performance, a challenge for which existing FL algorithms, designed without infrastructure in mind, are poorly suited. 
This motivates FedQueue, a queue-aware FL protocol that co-designs training and aggregation directly with scheduler behavior, supported by convergence guarantees and demonstrated improvements over baselines in these deployments.<\/p>\n<div class=\"tribe-events-single-event-description tribe-events-content\">\n<p>&nbsp;<\/p>\n<p><b>Bio:<\/b> Yijiang Li is a postdoctoral appointee in the Mathematics and Computer Science Division at Argonne National Laboratory. He received his Ph.D. in Operations Research from the Georgia Institute of Technology. His research focuses on scalable and reliable federated learning and scientific computing, spanning algorithm design, theoretical analysis, and large-scale deployment.<\/p>\n<p>&nbsp;<\/p>\n<p><em>Please note that the meeting URL for this event is available on the cels-seminars website, which requires an Argonne login.<\/em><\/p>\n<p>&nbsp;<\/p>\n<p>See all upcoming talks at\u00a0<a href=\"https:\/\/www.anl.gov\/mcs\/lans-seminars\">https:\/\/www.anl.gov\/mcs\/lans-seminars<\/a><\/p>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Seminar Title: Towards Scalable Federated Learning for Scientific Computing &nbsp; Speaker: Yijiang Li, Postdoctoral Appointee, Mathematics and Computer Science Division, Argonne National Laboratory &nbsp; Date: Thursday, March 26, 2026 Time: 3:00 PM-4:00 PM (In-Person) Location:\u00a0Hybrid, Bldg. 
240, Conference Room 4301 &hellip; <a href=\"https:\/\/wordpress.cels.anl.gov\/lans-seminars\/event\/lans-seminar-194\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":976,"featured_media":0,"template":"","meta":{"_acf_changed":false,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"_tribe_events_status":"","_tribe_events_status_reason":"","footnotes":""},"tags":[],"tribe_events_cat":[2],"class_list":["post-3952","tribe_events","type-tribe_events","status-publish","hentry","tribe_events_cat-seminar","cat_seminar"],"acf":[],"_links":{"self":[{"href":"https:\/\/wordpress.cels.anl.gov\/lans-seminars\/wp-json\/wp\/v2\/tribe_events\/3952","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/wordpress.cels.anl.gov\/lans-seminars\/wp-json\/wp\/v2\/tribe_events"}],"about":[{"href":"https:\/\/wordpress.cels.anl.gov\/lans-seminars\/wp-json\/wp\/v2\/types\/tribe_events"}],"author":[{"embeddable":true,"href":"https:\/\/wordpress.cels.anl.gov\/lans-seminars\/wp-json\/wp\/v2\/users\/976"}],"version-history":[{"count":3,"href":"https:\/\/wordpress.cels.anl.gov\/lans-seminars\/wp-json\/wp\/v2\/tribe_events\/3952\/revisions"}],"predecessor-version":[{"id":3973,"href":"https:\/\/wordpress.cels.anl.gov\/lans-seminars\/wp-json\/wp\/v2\/tribe_events\/3952\/revisions\/3973"}],"wp:attachment":[{"href":"https:\/\/wordpress.cels.anl.gov\/lans-seminars\/wp-json\/wp\/v2\/media?parent=3952"}],"wp:term":[{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/wordpress.cels.anl.gov\/lans-seminars\/wp-json\/wp\/v2\/tags?post=3952"},{"taxonomy":"tribe_events_cat","embeddable":true,"href":"https:\/\/wordpress.cels.anl.gov\/lans-seminars\/wp-json\/wp\/v2\/tribe_events_cat?post=3952"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}