
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 12 Issue: 12 | Dec 2025 www.irjet.net p-ISSN: 2395-0072
Akilabanu Chikkumbi1, Manu S. K2, Renuka R. U3, Sadiya Md. D4, Ruchita R. P5
1Assistant Professor, S G Balekundri Institute of Technology, Belagavi, Karnataka, India
2Student, S G Balekundri Institute of Technology, Belagavi, Karnataka, India
3Student, S G Balekundri Institute of Technology, Belagavi, Karnataka, India
4Student, S G Balekundri Institute of Technology, Belagavi, Karnataka, India
5Student, S G Balekundri Institute of Technology, Belagavi, Karnataka, India
Abstract – Websites and online services often experience sudden spikes in traffic, especially during events such as product sales, result announcements, or special offers. If too many people use the system at once, it may slow down, crash, or stop responding. To address this challenge, we apply time-series-based web traffic forecasting using the Random Forest and SARIMAX algorithms. These models use historical traffic patterns to predict future load across different time intervals. The experimental results show that forecasting upcoming traffic helps service providers prepare resources in advance, optimize server capacity, and reduce the risk of traffic congestion or downtime. This predictive approach improves system reliability and gives users a smoother experience during peak access periods.
By integrating Random Forest and SARIMAX for web traffic forecasting, service providers can proactively manage system load and prevent unexpected failures. This blended approach allows the system to generate better predictions and manage its resources more smoothly, which is especially helpful for online platforms that handle heavy activity.
Keywords: Web Traffic Forecasting, Random Forest, SARIMAX, Time-Series Prediction, Load Management, Peak Traffic Analysis.
Modern digital platforms increasingly rely on accurate forecasting to ensure smooth system operation and an uninterrupted user experience. In environments where user activity changes rapidly, such as shopping portals during sales, result announcement websites, or high-demand online services, traffic prediction becomes essential for preventing server overload and performance degradation. Traditional reactive approaches struggle to handle such dynamic patterns because they cannot anticipate short-term fluctuations, seasonal behaviors, or sudden changes caused by external factors. To address these challenges, advanced time-series forecasting models are widely used. SARIMAX is a powerful statistical model capable of capturing trends, seasonality, and the influence of external variables. It can effectively analyze short-, medium-, and long-term temporal patterns, making it suitable for modeling web traffic or system usage that follows irregular but repetitive behaviors.
In contrast, Recurrent Neural Networks (RNNs) provide a machine-learning-driven approach to modeling complex nonlinear dependencies in time-series data. Because they can learn long-term patterns in data, they work well in situations where many factors influence the trends and the behavior of the data changes considerably over time.
To analyze historical web traffic patterns and identify trends, seasonality, and nonlinear behaviors that influence sudden spikes during peak usage periods.
To implement the SARIMAX model for capturing linear relationships, seasonal variations, and the impact of external influencing factors in time-series data.
To develop an RNN-based forecasting model capable of learning complex temporal dependencies and nonlinear patterns present in dynamic web traffic.
To compare the forecasting performance of SARIMAX and RNN models across different time intervals, such as short-term, medium-term, and long-term horizons.
To design a hybrid prediction approach, integrating SARIMAX and RNN, to achieve more accurate and stable forecasting results than individual models.
The overall architecture of our system is designed to predict future web traffic by learning patterns from past data. The process begins by gathering past traffic data from the server, such as how many users visited the site each hour or each day. After gathering the data, it passes through the pre-processing stage, where it is cleaned, arranged in correct time order, and prepared for model training.
Once the data is usable, it is fed into two forecasting models: SARIMAX and Random Forest. SARIMAX works by studying time-based patterns such as trends, seasonality, and the effect of outside factors. It is helpful for understanding repeated behaviors such as weekly peaks or festival-time spikes. Random Forest, on the other hand, learns by creating many decision trees and combining their outputs. This helps it capture non-linear patterns and sudden changes in traffic that simple statistical models might miss.
The method also involves splitting the dataset into training and testing portions so we can see how well the models handle data they haven't seen before. SARIMAX is trained on the ordered time series, while Random Forest is trained using selected features such as past traffic values, moving averages, or time-based indicators. Once the models are trained, each one delivers its own set of predictions. These outcomes are then evaluated to determine which model delivers the most accurate output.
By using both models together, the system can capture long-term trends as well as sudden changes. This helps service providers plan ahead for busy periods and reduce the chances of the platform slowing down.
Preparing the data is a crucial part of predicting web traffic accurately because raw time-series information is typically messy and not ready to use. The first step is to arrange all the traffic records in a regular time order, such as hourly or daily. If any values are missing as a result of server issues or logging errors, they are filled using simple methods like interpolation or forward filling so the timeline stays complete. Unusual spikes or drops, often caused by bots, server downtime, or special events, are identified and corrected so that they do not distort the model. The values are also transformed or scaled to help the model learn the patterns more easily. Seasonal behaviors, like weekend dips or festival-time increases, are also taken into account so the model can separate normal patterns from unexpected changes. After cleaning and organizing everything, the data is divided into training and testing sets in the correct time order, helping the forecasting model learn precise and clear patterns.
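The pre-processing steps described above can be sketched in a few lines of pandas. This is a minimal illustration rather than the pipeline used in the study: the function name `preprocess_traffic`, the interquartile-range clipping rule with its factor of 3.0, the `log1p` transform, and the 14-day test window are all assumptions chosen for the example, and the data is synthetic.

```python
import numpy as np
import pandas as pd


def preprocess_traffic(raw: pd.Series, test_days: int = 14):
    """Clean a daily visit-count series and split it chronologically."""
    # Put the records on a regular daily grid; missing days become NaN.
    daily = raw.sort_index().asfreq("D")

    # Fill gaps by time-based interpolation, then forward/backward fill any edges.
    daily = daily.interpolate(method="time").ffill().bfill()

    # Clip outliers with an interquartile-range rule (the factor 3.0 is an assumption).
    q1, q3 = daily.quantile(0.25), daily.quantile(0.75)
    iqr = q3 - q1
    daily = daily.clip(lower=q1 - 3.0 * iqr, upper=q3 + 3.0 * iqr)

    # log1p compresses large spikes; np.expm1 undoes it after forecasting.
    scaled = np.log1p(daily)

    # Chronological split: the last `test_days` observations form the test set.
    return scaled.iloc[:-test_days], scaled.iloc[-test_days:]


# Synthetic example: 60 days with a weekly cycle, one bot-like spike, and two missing days.
idx = pd.date_range("2025-01-01", periods=60, freq="D")
visits = pd.Series(1000 + 200 * np.sin(2 * np.pi * np.arange(60) / 7), index=idx)
visits.iloc[30] = 50_000             # artificial outlier
visits = visits.drop(idx[[10, 11]])  # simulate logging gaps
train, test = preprocess_traffic(visits)
```

Splitting by position rather than at random keeps every test observation later in time than every training observation, which is what the time-ordered split in the text requires.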
In our research, we use two complementary approaches to estimate future web traffic: SARIMAX and Random Forest. Each model has its own strengths that help capture different patterns in the data. SARIMAX is a statistical model designed to analyze and forecast time-series data with clear trends and seasonal effects. It is especially useful for modeling regular patterns, such as daily or weekly cycles in web traffic, as well as the influence of external factors like holidays or special events. SARIMAX breaks down the time series into components such as autoregression, moving averages, differencing to stabilize the data, and seasonal terms, which together help in making accurate predictions over short and medium time frames.
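For reference, these components are commonly written in compact form with the backshift operator $B$, where $B y_t = y_{t-1}$. The notation below is the standard textbook form of a SARIMAX$(p,d,q)(P,D,Q)_s$ model expressed as a regression with seasonal ARIMA errors; the symbols are not taken from the study itself.

```latex
\phi_p(B)\,\Phi_P(B^{s})\,(1-B)^{d}\,(1-B^{s})^{D}\,\bigl(y_t - \beta^{\top}x_t\bigr)
  = \theta_q(B)\,\Theta_Q(B^{s})\,\varepsilon_t
```

Here $y_t$ is the traffic at time $t$, $x_t$ holds the external inputs (for example a holiday indicator) with coefficients $\beta$, $\phi_p$ and $\Phi_P$ are the non-seasonal and seasonal autoregressive polynomials, $\theta_q$ and $\Theta_Q$ are the corresponding moving-average polynomials, $(1-B)^d$ and $(1-B^s)^D$ are the regular and seasonal differencing terms, and $\varepsilon_t$ is white noise. For daily data with a weekly cycle, $s = 7$.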
Random Forest, on the other hand, is a machine learning model that builds numerous decision trees during training and combines their predictions to improve accuracy and reduce overfitting. Unlike SARIMAX, Random Forest doesn't presume any specific statistical distribution and can learn complex, non-linear relationships in the data. This makes it especially effective for capturing sudden traffic spikes or irregular patterns that traditional time-series models might miss. By combining these two models, we aim to leverage both the interpretability and seasonal strengths of SARIMAX and the flexibility and non-linear learning ability of Random Forest. This hybrid approach is expected to provide more reliable and accurate web traffic forecasts, helping service providers manage resources effectively during peak demand.
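The text does not specify how the two forecasts are combined. One simple possibility, shown here purely as an assumption, is a weighted average in which each model's weight is inversely proportional to its validation error. The function names and the numbers are hypothetical.

```python
def inverse_error_weights(err_sarimax: float, err_rf: float) -> tuple[float, float]:
    """Weight each model by the inverse of its validation error (e.g. RMSE)."""
    inv_s, inv_r = 1.0 / err_sarimax, 1.0 / err_rf
    total = inv_s + inv_r
    return inv_s / total, inv_r / total


def blend_forecasts(sarimax_pred, rf_pred, w_sarimax, w_rf):
    """Combine two forecasts of equal length point by point."""
    if len(sarimax_pred) != len(rf_pred):
        raise ValueError("forecasts must cover the same time steps")
    return [w_sarimax * s + w_rf * r for s, r in zip(sarimax_pred, rf_pred)]


# Hypothetical validation RMSEs and forecasts, for illustration only.
w_s, w_r = inverse_error_weights(err_sarimax=100.0, err_rf=300.0)
blended = blend_forecasts([1000.0, 1200.0], [1100.0, 1000.0], w_s, w_r)
```

With these inputs the model with the lower error (SARIMAX in this made-up case) receives a weight of 0.75 and the other 0.25. Other combination schemes, such as feeding SARIMAX residuals to the Random Forest, are equally plausible readings of "hybrid".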

After selecting suitable forecasting techniques, the next phase is to train each model on the cleaned and structured dataset and then assess how accurately it performs. The dataset is split into two sections: the training set, which the models use to learn underlying patterns, and the testing set, which checks how well the models can predict new, unseen data.
For the SARIMAX model, training focuses on identifying the optimal parameters that represent the historical behavior of the time series, including seasonality and any external influencing factors. The model studies previous traffic values along with additional inputs such as holidays or special events to understand patterns. Choosing the correct autoregressive and seasonal orders is important and is usually done using statistical tests and model-selection criteria.
For the Random Forest model, the training process begins with generating feature variables from the original time series. These features may include lagged traffic values, moving averages, time-based indicators (such as hour, weekday, or month), and any other relevant inputs. The Random Forest algorithm creates multiple decision trees using different subsets of the training data and then combines their outputs, enabling it to learn complex, non-linear relationships in the data.
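The feature construction described above might look like the following sketch, which uses pandas and scikit-learn. The specific lags (1, 2, and 7 days), the 7-day rolling mean, the helper name `make_features`, and the hyperparameters are assumptions for illustration, and the data is synthetic.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor


def make_features(series: pd.Series) -> pd.DataFrame:
    """Build lag, rolling-mean, and calendar features from a daily series."""
    df = pd.DataFrame({"y": series})
    for lag in (1, 2, 7):
        df[f"lag_{lag}"] = df["y"].shift(lag)
    # Shift before rolling so the average only uses values known before time t.
    df["roll_mean_7"] = df["y"].shift(1).rolling(7).mean()
    df["weekday"] = df.index.dayofweek
    df["month"] = df.index.month
    return df.dropna()


# Synthetic daily traffic with a weekly cycle (illustrative data only).
rng = np.random.default_rng(1)
idx = pd.date_range("2025-01-01", periods=120, freq="D")
traffic = pd.Series(
    1000 + 150 * np.sin(2 * np.pi * np.arange(120) / 7) + rng.normal(0, 20, 120),
    index=idx,
)

data = make_features(traffic)
train, test = data.iloc[:-14], data.iloc[-14:]
feature_cols = [c for c in data.columns if c != "y"]

rf = RandomForestRegressor(n_estimators=200, random_state=0)
rf.fit(train[feature_cols], train["y"])

# One-step-ahead predictions: each test row uses the actual lagged values.
rf_pred = rf.predict(test[feature_cols])
```

Note that these are one-step-ahead predictions. A genuine multi-day forecast would have to feed each prediction back in as the lag for the next day, since future actual values are not available.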
After training both models, forecasts are produced for the testing period. These predictions are compared with the actual observed traffic values using evaluation metrics such as RMSE, MAE, and MAPE. These error measures indicate how close the forecasts are to the real values and help identify which model delivers better performance.
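The three metrics follow their standard definitions and can be computed with the standard library alone, as in the sketch below. The sample values are made up; skipping zero actuals in MAPE is one common convention and is an assumption here.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error: penalizes large misses more heavily."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error, in the same units as the traffic counts."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error; points with a zero actual value are skipped."""
    terms = [abs((a - p) / a) for a, p in zip(actual, predicted) if a != 0]
    if not terms:
        raise ValueError("MAPE is undefined when all actual values are zero")
    return 100.0 * sum(terms) / len(terms)


# Made-up values for illustration.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
print(round(rmse(actual, predicted), 2))  # 12.91
print(round(mae(actual, predicted), 2))   # 10.0
print(round(mape(actual, predicted), 2))  # 6.67
```

RMSE and MAE are expressed in visits, while MAPE is a percentage, which makes it easier to compare across sites with different traffic volumes.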
In our project, users have several options to explore website traffic forecasts based on data from the past 60 days. The platform offers forecasts generated by both the Random Forest algorithm and the SARIMAX model, which enables straightforward comparison of how well each performs.
A key feature of the project is the accuracy comparison graph, which visually displays how effectively each model predicts web traffic. Users can hover over the graph to see detailed traffic predictions from both models at specific points in time, making it simple to understand the differences in their forecasts.
Additionally, the system offers the option to download the 30-day forecast data as a CSV file for deeper examination or record-keeping. This downloadable file contains detailed traffic predictions for each day, generated by both models. Based on the accuracy comparison, the Random Forest and SARIMAX models each have their strengths. Random Forest tends to perform better in capturing sudden fluctuations and nonlinear patterns, while SARIMAX excels at modeling regular seasonal trends. Overall, combining the observations from both models offers a well-rounded and reliable forecast that can help website administrators manage traffic effectively and prepare for peak demand periods.
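A CSV export of this kind could be produced with Python's standard `csv` module, as sketched below. The function name `forecast_to_csv`, the column names, and the placeholder numbers are hypothetical; the actual file layout used by the system is not described.

```python
import csv
import io
from datetime import date, timedelta


def forecast_to_csv(start: date, rf_pred, sarimax_pred) -> str:
    """Serialize two daily forecasts of equal length as CSV text."""
    if len(rf_pred) != len(sarimax_pred):
        raise ValueError("both forecasts must cover the same days")
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["date", "random_forest", "sarimax"])  # hypothetical column names
    for offset, (rf_value, sx_value) in enumerate(zip(rf_pred, sarimax_pred)):
        day = start + timedelta(days=offset)
        # Visit counts are rounded to whole numbers for readability.
        writer.writerow([day.isoformat(), round(rf_value), round(sx_value)])
    return buffer.getvalue()


# Placeholder numbers stand in for real model output.
csv_text = forecast_to_csv(date(2025, 12, 1), [1000.4] * 30, [980.6] * 30)
```

Returning the CSV as text keeps the function independent of how the file is delivered; a web application could send the string as a download response, while a script could write it to disk.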


4. CONCLUSION
This project successfully uses SARIMAX and Random Forest models to forecast web traffic based on the past 30 days of data. The system helps website administrators predict future traffic, allowing better resource management during peak times. While SARIMAX captures seasonal trends well, Random Forest handles sudden changes effectively. The combination of both models, along with features like accuracy comparison and data download, makes the forecasting reliable and user-friendly.




Manu S. K is a final year Computer Science and Engineering student. She is currently pursuing her degree at S G Balekundri Institute of Technology.
Renuka R. U is a final year Computer Science and Engineering student. She is currently pursuing her degree at S G Balekundri Institute of Technology.
Ruchita R. P is a final year Computer Science and Engineering student. She is currently pursuing her degree at S G Balekundri Institute of Technology.
Sadiya Md. D is a final year Computer Science and Engineering student. She is currently pursuing her degree at S G Balekundri Institute of Technology.