{"id":7152,"date":"2026-04-02T08:33:14","date_gmt":"2026-04-02T08:33:14","guid":{"rendered":"https:\/\/www.proefschriftmaken.nl\/portfolio\/e-j-j-wijler\/"},"modified":"2026-04-02T08:33:20","modified_gmt":"2026-04-02T08:33:20","slug":"e-j-j-wijler","status":"publish","type":"us_portfolio","link":"https:\/\/www.proefschriftmaken.nl\/en\/portfolio\/e-j-j-wijler\/","title":{"rendered":"E.J.J. Wijler"},"content":{"rendered":"","protected":false},"excerpt":{"rendered":"","protected":false},"author":8,"featured_media":7153,"comment_status":"closed","ping_status":"closed","template":"","meta":{"_acf_changed":false,"footnotes":""},"us_portfolio_category":[45],"class_list":["post-7152","us_portfolio","type-us_portfolio","status-publish","has-post-thumbnail","hentry","us_portfolio_category-new-template"],"acf":{"naam_van_het_proefschift":"High-Dimensional Time Series Analysis","samenvatting":"We bevinden ons momenteel in een nieuw tijdperk van data-analyse, dat gekarakteriseerd wordt door de beschikbaarheid van grote, ongestructureerde datasets. U kunt hierbij denken aan data die wordt verzameld door grote tech-bedrijven zoals Google en Facebook, maar ook gegevens die verzameld worden via de klantenkaart van de lokale supermarkt en de betaalpas waarmee afgerekend wordt. Omdat traditionele statistische modellen vaak het beste werken wanneer er rekening gehouden dient te worden met de effecten van slechts enkele variabelen, zijn er de laatste jaren veel nieuwe statistische methoden ontwikkeld die beter toepasbaar zijn op grote datasets. Deze nieuwe methoden worden ook wel hoog-dimensionale statistieken genoemd. Echter, binnen economische en financi\u00eble sectoren, werkt men met name met tijdreeksen, zoals bijvoorbeeld de Nederlandse werkloosheidscijfers of het bruto binnenlands product. 
Tijdreeksen vertonen vaak unieke eigenschappen, zoals trendmatig gedrag waarbij toekomstige waardes sterk afhangen van het verleden, waarvan we weten dat ze de uitkomsten van traditionele statistieken sterk be\u00efnvloeden. Het is daarom niet verstandig om hoog-dimensionale statistieken toe te passen op grote verzamelingen van tijdreeksen zonder theoretische verificatie of praktische aanpassingen. Dit onderwerp staat centraal in mijn proefschrift.\n\nIn dit proefschrift, richten we ons enkel op statistische methoden welke onder te verdelen zijn in drie algemene categorie\u00ebn: (1) factor modellen, (2) geregulariseerde regressie en (3) hybride modellen. Het idee achter factormodellen is dat alle waargenomen variabelen worden aangedreven door enkele latente (niet geobserveerde) variabelen. Zo kunnen we bijvoorbeeld werkloosheid observeren binnen verschillende industrie\u00ebn, of rentetarieven voor verschillende looptijden, maar worden al deze variabelen mogelijk (deels) verklaard door de onderliggende bedrijfsconjunctuur. Factor modellen proberen deze latente variabelen, de factoren, te schatten en daarmee de data samen te vatten met een minimum verlies aan informatie. Op deze manier hoeft er geen complex model met honderden geobserveerde variabelen geschat te worden. Een alternatieve methode is om de data niet samen te vatten, maar om ervan uit te gaan dat veel variabelen simpelweg irrelevant zijn voor het verklaren van de afhankelijke variabele waar men in ge\u00efnteresseerd is. Zo is het aannemelijk dat de grondstofprijzen voor thee van invloed zijn op de verkoop van koffie, maar dat de grondstofprijzen voor ketchup hier weinig in verklaren. Voor dit soort applicaties is geregulariseerde regressie uitermate geschikt. Deze vorm van regressie schat een lineair model en zorgt er automatisch voor dat de geschatte bijdrages van irrelevante variabelen omlaag geschaald worden. 
Sommige vormen van geregulariseerde regressie, zoals de Least Absolute Shrinkage and Selection Operator (LASSO) welke een belangrijke rol in dit proefschrift heeft, hebben de wenselijke eigenschap dat ze irrelevante variabelen geheel automatisch uit het geschatte model kunnen verwijderen. Als laatste optie komen in dit proefschrift hybride methoden aan bod, welke irrelevante variabelen verwijderen en de relevante variabelen middels het schatten van factoren samenvatten.\n\nIn Hoofdstuk 2 vergelijken we de voorspellingsprestaties van statistische methoden welke onder te verdelen zijn middels de bovenstaande categorisatie. Door het uitvoeren van gecontroleerde simulaties waarin we bepaalde data eigenschappen doelbewust vastleggen, vinden we dat factor modellen en geregulariseerde regressie goed presteren in het kader waar ze voor ontwikkeld zijn, maar ontdekken we ook dat geregulariseerde regressie beter kan voorspellen indien er factoren in de data aanwezig zijn met \u201cveel ruis\u201d. In een empirische toepassing vinden we dan ook dat voor sommige Amerikaanse economische indicatoren geregulariseerde regressie nauwkeuriger voorspelt dan factor modellen, ondanks dat de aanwezigheid van factoren in een macro-economische toepassing zeer aannemelijk is.\n\nGemotiveerd door de gunstige prestaties van geregulariseerde regressie, ontwikkelen we in Hoofdstuk 3 de Single-equation Penalized Error-Correction Selector (SPECS). SPECS is een gespecialiseerde methode waarmee geregulariseerde lineaire modellen geschat kunnen worden die rekening houden met het trendmatige gedrag van de beschouwde variabelen. Zo komt het in economische toepassingen geregeld voor dat individuele variabelen een stochastische (willekeurige) trend bevatten, maar dat deze trend verdwijnt na het nemen van een bepaalde lineaire combinatie. Dit welbekende fenomeen heet cointegratie en heeft grote invloed op het gedrag van statistieken. 
Wij leiden theoretische (asymptotische) resultaten af die laten zien dat onze methode zich wenselijk gedraagt wanneer de steekproefgrootte groeit. Ter demonstratie van de toepasbaarheid van SPECS, gebruiken we onze nieuwe methode om de werkloosheid in Nederland te voorspellen aan de hand van de populariteit van 100 verschillende Google zoektermen, waaronder bijvoorbeeld \u201cwerkloosheidsuitkering\u201d en \u201csolliciteren\u201d. In lijn der verwachtingen, overtreft SPECS de voorspellingsprestaties van hoog-dimensionale statistieken welke cointegratie negeren.\n\nIn Hoofdstuk 4 leiden we vergelijkbare theoretische resultaten af onder minder restrictieve aannames. Zo laten we toe dat het aantal variabelen in het model mag toenemen wanneer de steekproefgrootte toeneemt. Dit is van belang om een duidelijk inzicht te geven in het gedrag van SPECS bij toepassingen op datasets met een groot aantal variabelen.\n\nTen slotte, in Hoofdstuk 5 vergelijken we (1) statistische testen om het trendmatig gedrag van tijdreeksen te classificeren en (2) een selectie aan hoog-dimensionale voorspellingsmethoden welke cointegratie al dan niet in acht nemen. Middels simulaties vinden we dat het uitermate belangrijk is om de trend in de afhankelijke variabele juist te classificeren, gezien de nauwkeurigheid waarmee deze variabele voorspeld kan worden sterk van deze classificatie afhangt. In een macro-economische toepassing op een Amerikaanse dataset vinden we dat geen enkel model consistent het nauwkeurigst voorspelt en is er ook geen definitief antwoord op de vraag of cointegratie belangrijk is voor het maken van voorspellingen. Gezien er gevallen zijn waarin SPECS beter presteert dan de andere methodes in de vergelijking, bevestigen we dat onze methode zowel theoretische als toegepaste waarde heeft. 
Echter, zal de keuze voor de optimale methode altijd van de specifieke toepassing afhankelijk zijn.","summary":"We are currently in a new era of data analysis, characterized by the availability of large, unstructured datasets. These include data collected by major tech companies like Google and Facebook, as well as data gathered via local supermarket loyalty cards and debit card transactions. Since traditional statistical models often work best when accounting for the effects of only a few variables, many new statistical methods better suited for large datasets have been developed in recent years. These new methods are also called high-dimensional statistics. However, within economic and financial sectors, one primarily works with time series, such as Dutch unemployment figures or gross domestic product. Time series often exhibit unique properties, such as trend behavior where future values depend strongly on the past, which we know significantly influence the outcomes of traditional statistics. It is therefore unwise to apply high-dimensional statistics to large collections of time series without theoretical verification or practical adaptations. This topic is central to my dissertation.\n\nIn this dissertation, we focus solely on statistical methods that can be subdivided into three general categories: (1) factor models, (2) regularized regression, and (3) hybrid models. The idea behind factor models is that all observed variables are driven by a few latent (unobserved) variables. For example, we can observe unemployment within different industries or interest rates for different maturities, but all these variables may be (partially) explained by the underlying business cycle. Factor models attempt to estimate these latent variables, the factors, and thereby summarize the data with a minimum loss of information. In this way, a complex model with hundreds of observed variables does not need to be estimated. 
An alternative method is not to summarize the data, but to assume that many variables are simply irrelevant for explaining the dependent variable of interest. For example, it is plausible that tea prices affect coffee sales, but ketchup prices explain very little. For these types of applications, regularized regression is highly suitable. This form of regression estimates a linear model and automatically ensures that the estimated contributions of irrelevant variables are scaled down. Some forms of regularized regression, such as the Least Absolute Shrinkage and Selection Operator (LASSO), which plays an important role in this dissertation, have the desirable property that they can remove irrelevant variables from the estimated model entirely automatically. Hybrid methods are discussed as a final option, removing irrelevant variables and summarizing relevant variables through factor estimation.\n\nIn Chapter 2, we compare the forecasting performance of statistical methods subdivided using the categorization above. By conducting controlled simulations where we purposefully establish certain data properties, we find that factor models and regularized regression perform well in their intended contexts. However, we also discover that regularized regression can forecast better if factors are present in the data with \"a lot of noise.\" In an empirical application, we find that regularized regression forecasts more accurately than factor models for some American economic indicators, despite the likely presence of factors in a macroeconomic context.\n\nMotivated by the favorable performance of regularized regression, in Chapter 3 we develop the Single-equation Penalized Error-Correction Selector (SPECS). SPECS is a specialized method for estimating regularized linear models that account for the trend behavior of the variables under consideration. 
In economic applications, it frequently happens that individual variables contain a stochastic (random) trend, but this trend disappears after taking a specific linear combination. This well-known phenomenon is called cointegration and has a significant impact on statistical behavior. We derive theoretical (asymptotic) results showing that our method behaves desirably as sample sizes grow. To demonstrate the applicability of SPECS, we use our new method to forecast unemployment in the Netherlands using the popularity of 100 different Google search terms, such as \"unemployment benefit\" and \"applying for a job.\" As expected, SPECS outperforms the forecasting performance of high-dimensional statistics that ignore cointegration.\n\nIn Chapter 4, we derive similar theoretical results under less restrictive assumptions. For example, we allow the number of variables in the model to increase as the sample size increases. This is important for providing clear insight into the behavior of SPECS when applied to datasets with a large number of variables.\n\nFinally, in Chapter 5, we compare (1) statistical tests to classify the trend behavior of time series and (2) a selection of high-dimensional forecasting methods that do or do not account for cointegration. Simulations show it is extremely important to correctly classify the trend in the dependent variable, as the accuracy of the forecast heavily depends on this classification. In a macroeconomic application to an American dataset, we find that no single model is consistently most accurate, and there is no definitive answer to whether cointegration is important for making forecasts. Given there are cases where SPECS performs better than other methods in the comparison, we confirm that our method has both theoretical and applied value. However, the choice of the optimal method will always depend on the specific application.","auteur":"E.J.J. 
Wijler","auteur_slug":"ejj-wijler","publicatiedatum":"27 maart 2020","taal":"NL","url_flipbook":"https:\/\/ebook.proefschriftmaken.nl\/ebook\/ejjwijler?iframe=true","url_download_pdf":"https:\/\/ebook.proefschriftmaken.nl\/download\/03abdbca-bb69-4068-a364-a26ea6abbbb6\/optimized","url_epub":"","ordernummer":"FTP-202604020828","isbn":"978-94-6380-745-6","doi_nummer":"","naam_universiteit":"Universiteit Maastricht","afbeeldingen":7154,"naam_student:":"","binnenwerk":"","universiteit":"Universiteit Maastricht","cover":"","afwerking":"","cover_afwerking":"","design":""},"_links":{"self":[{"href":"https:\/\/www.proefschriftmaken.nl\/en\/wp-json\/wp\/v2\/us_portfolio\/7152","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.proefschriftmaken.nl\/en\/wp-json\/wp\/v2\/us_portfolio"}],"about":[{"href":"https:\/\/www.proefschriftmaken.nl\/en\/wp-json\/wp\/v2\/types\/us_portfolio"}],"author":[{"embeddable":true,"href":"https:\/\/www.proefschriftmaken.nl\/en\/wp-json\/wp\/v2\/users\/8"}],"replies":[{"embeddable":true,"href":"https:\/\/www.proefschriftmaken.nl\/en\/wp-json\/wp\/v2\/comments?post=7152"}],"version-history":[{"count":1,"href":"https:\/\/www.proefschriftmaken.nl\/en\/wp-json\/wp\/v2\/us_portfolio\/7152\/revisions"}],"predecessor-version":[{"id":7155,"href":"https:\/\/www.proefschriftmaken.nl\/en\/wp-json\/wp\/v2\/us_portfolio\/7152\/revisions\/7155"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.proefschriftmaken.nl\/en\/wp-json\/wp\/v2\/media\/7153"}],"wp:attachment":[{"href":"https:\/\/www.proefschriftmaken.nl\/en\/wp-json\/wp\/v2\/media?parent=7152"}],"wp:term":[{"taxonomy":"us_portfolio_category","embeddable":true,"href":"https:\/\/www.proefschriftmaken.nl\/en\/wp-json\/wp\/v2\/us_portfolio_category?post=7152"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}