{"id":11960,"date":"2026-04-22T14:36:41","date_gmt":"2026-04-22T14:36:41","guid":{"rendered":"https:\/\/www.proefschriftmaken.nl\/portfolio\/tianqi-zhao\/"},"modified":"2026-04-22T14:36:47","modified_gmt":"2026-04-22T14:36:47","slug":"tianqi-zhao","status":"publish","type":"us_portfolio","link":"https:\/\/www.proefschriftmaken.nl\/en\/portfolio\/tianqi-zhao\/","title":{"rendered":"Tianqi Zhao"},"content":{"rendered":"","protected":true},"excerpt":{"rendered":"","protected":true},"author":7,"featured_media":11963,"comment_status":"closed","ping_status":"closed","template":"","meta":{"_acf_changed":false,"footnotes":""},"us_portfolio_category":[45],"class_list":["post-11960","us_portfolio","type-us_portfolio","status-publish","post-password-required","hentry","us_portfolio_category-new-template"],"acf":{"naam_van_het_proefschift":"CHARACTERIZING LEARNING DIFFICULTY IN GRAPH-STRUCTURED DATA","samenvatting":"Grafen bieden een natuurlijke manier om zowel de attributen van individuele entiteiten als de structuur van hun onderlinge verbindingen weer te geven, wat hen een krachtig raamwerk maakt voor het modelleren van complexe systemen met ingewikkelde relaties. Dergelijke netwerken komen voor in diverse domeinen, waaronder sociale netwerken, biologie, kwantumfysica en kennisgrafen. Denk bijvoorbeeld aan eiwitten en hun interacties in eiwit-eiwitnetwerken, of gebruikers en vriendschappen in sociale netwerken.\n\nIn de afgelopen jaren zijn Graph Neural Networks (GNN's) het dominante paradigma geworden voor het leren van graaf-gestructureerde data, waarbij sterke resultaten zijn behaald in taken zoals knooppuntclassificatie, linkvoorspelling en graafclassificatie op academische benchmarkdatasets. Een nadere beschouwing van een van de scenario's uit de praktijk, multi-label node classification (MLNC) datasets, onthult echter complexere attribuutdistributies en aanzienlijke problemen met de datakwaliteit waarbij GNN's niet goed kunnen leren. 
Veel grafen uit de echte wereld zijn ruizig, vertonen ongebalanceerde labeldistributies en tonen een lage label-homofilie, wat de extractie van zinvolle informatie uit lokale buurten bemoeilijkt. Deze observatie roept een belangrijke vraag op: in welke mate bepalen de structurele en distributieve eigenschappen van graafdata de prestaties van het model?\n\nOm dit te onderzoeken werden bestaande multi-label graafdatasets systematisch geanalyseerd en werd een synthetische graafgenerator met instelbare parameters ontwikkeld om een gecontroleerde verkenning van sleutelfactoren zoals homofilie, graadverdeling en labeltoewijzing mogelijk te maken. Met dit fundament ontstond een vervolgvraag: hoe goed presteren state-of-the-art GNN's daadwerkelijk onder de uitdagende omstandigheden van multi-label grafen?\n\nEmpirische evaluatie bracht consistente tekortkomingen van bestaande methoden aan het licht, met name hun moeite met het ontwarren van overlappende labelsignalen. Dit motiveerde het ontwerp van GNN-MultiFix, dat feature-propagatie integreert met label-propagatie en positionele codering om de specifieke uitdagingen van MLNC aan te pakken. Hoewel effectief in statische grafen, breidt deze onderzoekslijn zich natuurlijk uit naar dynamische omgevingen: als deze uitdagingen aanhouden in statische multi-label grafen, hoe worden ze dan versterkt wanneer de graaf zelf in de loop van de tijd evolueert?\n\nDeze overweging leidt tot de setting van Continual Graph Learning (CGL), waarbij zowel de graafstructuur als de labelruimte dynamisch evolueren. Naast de moeilijkheden van ruizige data en lage homofilie, moeten modellen in deze setting opboksen tegen catastrofaal vergeten naarmate nieuwe taken arriveren. 
Om deze uitdagingen systematisch te bestuderen, werd AGALE voorgesteld als een graafbewust evaluatieraamwerk voor continu leren, dat een principi\u00eble manier biedt om modellen te benchmarken onder evoluerende graafomstandigheden.\n\nTen slotte benadrukken de studies van statische en continue leeromgevingen een breder vraagstuk: hoe moeten graafmodellen uitgebreider worden ge\u00ebvalueerd dan via nauwe benchmarks en metrieken? Deze vraag motiveert een datacentrisch perspectief op graaf-machine learning, waarbij de rol van graafeigenschappen op knooppuntniveau bij het vormgeven van prestaties wordt benadrukt en er wordt opgeroepen tot evaluatieraamwerken die niet alleen nauwkeurigheid meten, maar ook de wisselwerking tussen datakenmerken en de betrouwbaarheid van de modelvoorspelling verduidelijken.","summary":"Graphs provide a natural way to represent both the attributes of individual entities and the structure of their interconnections, making them a powerful framework for modeling complex systems with intricate relationships. Such networks appear across diverse domains, including social networks, biology, quantum physics, and knowledge graphs. Examples include proteins and their interactions in protein\u2013protein networks, and users and friendships in social networks.\n\nIn recent years, Graph Neural Networks (GNNs) have become the dominant paradigm for learning from graph-structured data, achieving strong results in tasks such as node classification, link prediction, and graph classification on academic benchmark datasets. However, a closer examination of datasets from one real-world scenario, multi-label node classification (MLNC), reveals more complicated attribute distributions and substantial data quality issues under which GNNs fail to learn. Many real-world graphs are noisy, exhibit unbalanced label distributions, and display low label homophily, which complicates the extraction of meaningful information from local neighborhoods. 
This observation raises an important question: to what extent do the structural and distributional properties of graph data shape the performance of the model?\n\nTo investigate this, existing multi-label graph datasets were systematically analyzed, and a synthetic graph generator with tunable parameters was developed to enable controlled exploration of key factors such as homophily, degree distribution, and label assignment. With this foundation, a subsequent question emerged: how well do state-of-the-art GNNs actually perform under the challenging conditions of multi-label graphs?\n\nEmpirical evaluation revealed consistent shortcomings of existing methods, particularly their difficulty in disentangling overlapping label signals. This motivated the design of GNN-MultiFix, which integrates feature propagation with label propagation and positional encoding to address the specific challenges of MLNC. While GNN-MultiFix is effective on static graphs, this line of inquiry naturally extends to dynamic settings: if these challenges persist in static multi-label graphs, how are they amplified when the graph itself evolves over time?\n\nThis consideration leads to the setting of Continual Graph Learning (CGL), where both graph structure and label space evolve dynamically. Beyond the difficulties of noisy data and low homophily, models in this setting must contend with catastrophic forgetting as new tasks arrive. To systematically study these challenges, AGALE was proposed as a graph-aware continual learning evaluation framework, providing a principled way to benchmark models under evolving graph conditions.\n\nFinally, the studies of the static and continual learning settings highlight a broader issue: how should graph models be evaluated more comprehensively, beyond narrow benchmarks and metrics? 
This question motivates a data-centric perspective on graph machine learning, emphasizing the role of instance-level graph properties in shaping performance and calling for evaluation frameworks that not only measure accuracy but also illuminate the interplay between data characteristics and the reliability of model predictions.","auteur":"Tianqi Zhao","auteur_slug":"tianqi-zhao","publicatiedatum":"12 mei 2026","taal":"EN","url_flipbook":"https:\/\/ebook.proefschriftmaken.nl\/ebook\/tianqizhao?iframe=true","url_download_pdf":"https:\/\/ebook.proefschriftmaken.nl\/download\/b9940f1b-5d8f-4404-bb7c-1656d44e6d32\/optimized","url_epub":"","ordernummer":"18912","isbn":"978-94-6518-037-3","doi_nummer":"","naam_universiteit":"Overig","afbeeldingen":11964,"naam_student:":"","binnenwerk":"","universiteit":"Overig","cover":"","afwerking":"","cover_afwerking":"","design":""},"_links":{"self":[{"href":"https:\/\/www.proefschriftmaken.nl\/en\/wp-json\/wp\/v2\/us_portfolio\/11960","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.proefschriftmaken.nl\/en\/wp-json\/wp\/v2\/us_portfolio"}],"about":[{"href":"https:\/\/www.proefschriftmaken.nl\/en\/wp-json\/wp\/v2\/types\/us_portfolio"}],"author":[{"embeddable":true,"href":"https:\/\/www.proefschriftmaken.nl\/en\/wp-json\/wp\/v2\/users\/7"}],"replies":[{"embeddable":true,"href":"https:\/\/www.proefschriftmaken.nl\/en\/wp-json\/wp\/v2\/comments?post=11960"}],"version-history":[{"count":1,"href":"https:\/\/www.proefschriftmaken.nl\/en\/wp-json\/wp\/v2\/us_portfolio\/11960\/revisions"}],"predecessor-version":[{"id":11965,"href":"https:\/\/www.proefschriftmaken.nl\/en\/wp-json\/wp\/v2\/us_portfolio\/11960\/revisions\/11965"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.proefschriftmaken.nl\/en\/wp-json\/wp\/v2\/media\/11963"}],"wp:attachment":[{"href":"https:\/\/www.proefschriftmaken.nl\/en\/wp-json\/wp\/v2\/media?parent=11960"}],"wp:term":[{"taxonomy":"us_portfolio_category","embeddable":true,"href"
:"https:\/\/www.proefschriftmaken.nl\/en\/wp-json\/wp\/v2\/us_portfolio_category?post=11960"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}