Calcura – an innovative AI-powered handwritten problem solver



International Research Journal of Engineering and Technology (IRJET)

Volume: 12 Issue: 05 | May 2025 www.irjet.net

e-ISSN: 2395-0056

p-ISSN: 2395-0072


Prof. Supriya C
Dept of ISE
Acharya Institute of Technology
Affiliated to VTU, Belagavi – 590018
Bengaluru – 560107

Akash Shetty B
Dept of ISE
Acharya Institute of Technology
Affiliated to VTU, Belagavi – 590018
Bengaluru – 560107

Anjan P
Dept of ISE
Acharya Institute of Technology
Affiliated to VTU, Belagavi – 590018
Bengaluru – 560107

Adithya N
Dept of ISE
Acharya Institute of Technology
Affiliated to VTU, Belagavi – 590018
Bengaluru – 560107

Chandan S
Dept of ISE
Acharya Institute of Technology
Affiliated to VTU, Belagavi – 590018
Bengaluru – 560107

Abstract

This paper introduces Calcura, an AI-powered platform designed to simplify and enhance math and physics problem-solving through handwritten input recognition. By leveraging the Google Gemini API, Calcura accurately interprets user-drawn mathematical expressions and delivers solutions via an intuitive frontend and a robust backend. The frontend, developed using React and TypeScript, provides an interactive canvas for handwritten input, while the backend, built with FastAPI or Flask, processes images using Pillow and performs computations with NumPy and SymPy. Supporting operations ranging from basic arithmetic to advanced calculus and physics, Calcura addresses challenges such as handwriting variability, real-time feedback, and efficient API integration. Initial evaluations demonstrate high accuracy in recognition and solution delivery. Designed for students, educators, and professionals, Calcura bridges traditional methods and digital innovation, providing an efficient and user-friendly platform for mathematical problem-solving.

Introduction

Mathematics and physics problem-solving have been fundamental to education and research, requiring precision and analytical skills to address challenges across various domains. With the rapid advancements in technology, traditional problem-solving methods are transitioning towards innovative, digital solutions that improve efficiency and accessibility. Among these advancements, handwritten input recognition has emerged as a crucial technology, enabling users to interact with systems naturally, replicating the conventional pen-and-paper approach.

Calcura is an AI-powered platform designed to revolutionize the way users solve mathematical and physics-related problems by leveraging state-of-the-art artificial intelligence and intuitive user interfaces. The system integrates advanced tools and frameworks to allow users to input handwritten expressions and obtain accurate solutions in real time. Calcura addresses the growing demand for intelligent, accessible platforms that cater to students, educators, and professionals across disciplines.

At the core of Calcura is the Google Gemini API, which ensures precise recognition of handwritten mathematical expressions. The backend, implemented using FastAPI or Flask, processes input images using the Pillow library, while performing mathematical computations with robust libraries like NumPy and SymPy. On the frontend, a React-based interface with TypeScript provides an interactive canvas for input, delivering an intuitive and seamless user experience. This cross-platform solution ensures flexibility and accessibility for users on various devices.

This paper details the architecture of Calcura, emphasizing its ability to handle a wide range of computations, from basic arithmetic to advanced calculus and physics problems. The system overcomes key challenges, such as handwriting variation, real-time feedback delivery, and efficient API integration, to provide a reliable and effective platform.

Calcura bridges the gap between traditional problem-solving methods and modern technological advancements. By offering an accessible, user-friendly platform, it enhances user engagement and productivity. The subsequent sections of this paper outline the system’s design, key features, challenges addressed, and the potential impact of Calcura on mathematical problem-solving and education.

Literature Review

Calcura, an AI-powered platform for handwritten math and physics problem solving, addresses the limitations of traditional digital tools. This review examines relevant research to contextualize Calcura's development.


Handwriting Recognition:

Accurate handwriting recognition is fundamental. CNNs have proven effective in this area. Malakar et al. [1] and Majumder et al. [2] demonstrate CNNs' ability to recognize handwritten words and perform word spotting, which is relevant to Calcura's interpretation of mathematical symbols and equations. Research on multilingual handwriting [3] and text retrieval from forms [4, 5] further highlights the challenges and solutions in this domain. The work of Singh et al. [6] on recognizing legal amounts on checks showcases a real-world application. Preprocessing techniques like page segmentation [7] are also important. These studies emphasize robust feature extraction and model training for high accuracy.

Real-time Web Application Development:

Calcura's real-time nature requires efficient web development. Although no specific studies are cited here, FastAPI and WebSockets are commonly used for scalable, real-time applications. FastAPI enables efficient handling of user requests, while WebSockets facilitate low-latency communication for a seamless interactive experience.

Frontend Development:

Calcura's user interface uses React for its component-based architecture, enabling dynamic and user-friendly interfaces. CSS techniques ensure responsive and aesthetic design across devices.

Educational Technology:

Leahy et al. [8] discuss the impact of digital tools on education. Calcura aligns with this by bridging traditional and modern learning methods, offering a natural way to engage with math and physics concepts through handwritten input.

Problem/Solution:

Existing tools often lack the intuitiveness of handwriting. Calcura addresses this by enabling handwritten input, real-time feedback, and a user-friendly interface, aiming to improve productivity and accessibility.

Conclusion:

Calcura's development leverages research in handwriting recognition, real-time web development, and related fields. Using the Gemini API, Python, Node.js, React, and other tools, it aims to provide a valuable platform. Its focus on intuitive interaction and cross-platform compatibility makes it a promising tool for enhancing learning. Future work can expand supported notations, refine recognition accuracy, and improve user experience.

Methodology

System Architecture

Calcura employs a client-server architecture. The frontend, built with React.js and TypeScript, provides the user interface and captures handwritten input. The backend, implemented in Python using FastAPI or Flask, processes the input, performs handwriting recognition using the Google Gemini API, and computes solutions using a dedicated Math Engine. Communication between the frontend and backend is facilitated through JSON.

Fig. 1.1 Architecture of Handwriting Recognition.

Calcura leverages the Google Gemini API for its handwriting recognition capabilities.

1. Image Preprocessing:

User-drawn input from the canvas is captured as an image and sent to the backend. The backend preprocesses the image using Pillow (PIL). Preprocessing steps may include resizing, normalization, noise reduction, and binarization to optimize the image for the Gemini API.

2. Gemini API Integration:

The preprocessed image is sent to the Google Gemini API. Gemini, utilizing advanced deep learning models (likely including CNNs or similar architectures), analyzes the image and returns the recognized mathematical expression in a structured format (e.g., MathML).

3. Mathematical Computation

The recognized mathematical expression from the Gemini API is passed to the Math Engine, built using NumPy and SymPy. NumPy handles numerical computations, while SymPy performs symbolic manipulations. The Math Engine evaluates the expression and returns the solution.


4. Integration and Data Flow

The frontend captures user input and sends it to the backend as a JSON request. The backend preprocesses the image, sends it to the Gemini API for recognition, performs computations with the Math Engine, and sends the result back to the frontend in JSON format. The frontend then displays the solution to the user.
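To make step 3 concrete, the following is a minimal sketch of how a recognized arithmetic string could be evaluated safely on the backend. It is not the paper's Math Engine: it uses only Python's standard `ast` module as a stand-in for the NumPy/SymPy engine, handles plain arithmetic only, and the function name `evaluate_expression` is hypothetical.

```python
import ast
import operator

# Whitelisted operators; anything else in the recognized text is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def evaluate_expression(text: str) -> float:
    """Evaluate a plain arithmetic string such as '3 * (4 + 5)'.

    Hypothetical helper: a stand-in for the Math Engine, not its real API.
    """
    tree = ast.parse(text, mode="eval")
    return _eval_node(tree.body)


def _eval_node(node: ast.AST) -> float:
    # Numeric literals only (bool is excluded even though it subclasses int).
    if (
        isinstance(node, ast.Constant)
        and isinstance(node.value, (int, float))
        and not isinstance(node.value, bool)
    ):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        left = _eval_node(node.left)
        right = _eval_node(node.right)
        return _BINARY_OPS[type(node.op)](left, right)
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_eval_node(node.operand))
    # Names, calls, attribute access, etc. are never evaluated.
    raise ValueError(f"unsupported expression element: {type(node).__name__}")
```

Walking the syntax tree instead of calling `eval` means that recognized text containing function calls or variable names is rejected rather than executed. Symbolic tasks such as solving for a variable or differentiating would still need a library like SymPy.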

Modules and Dataflow

This section describes the key modules of Calcura and the flow of data through the system, providing a detailed view of the interaction between components.

Frontend Module (React/TypeScript)

The frontend module is responsible for capturing user input and displaying results. It consists of the following sub-modules:

1. Drawing Canvas: This manages the user's drawing input. It captures drawing events, converts the drawing into a digital image (e.g., base64-encoded or Blob format), and transmits the image data to the backend.

2. UI Component (Mantine/ShadCN): This provides the user interface elements, including buttons, forms, and display areas for results. It leverages a UI component library for consistent design and rapid development.

3. Communication: Handles all communication with the backend. It serializes requests (including image data) and deserializes responses (JSON results) using technologies like fetch or axios.

Backend Module (FastAPI/Flask)

The backend module performs the core processing of the handwritten expressions. Its sub-modules include:

1. API Endpoint: This defines the API endpoints that the frontend interacts with. It receives image data from the frontend and routes it to the appropriate processing modules.

2. Image Processing (Pillow): This preprocesses the received image to optimize it for the Gemini API. Preprocessing steps might include resizing, noise reduction, and format conversion.

3. Gemini API Integration: Interacts with the Google Gemini API. It sends the processed image to the API for expression recognition and receives the results (recognized mathematical expression and/or solution).

4. Mathematical Processing (NumPy/SymPy): Performs further mathematical processing if needed. If the Gemini API returns a recognized expression, this module can use NumPy (for numerical calculations) or SymPy (for symbolic manipulation) to simplify the expression, solve for variables, or perform other mathematical operations.

5. Response Generation Module: This module formats the results (recognized expression, solution, and any error messages) into a JSON object and sends it back to the frontend.
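The JSON exchange handled by the API Endpoint and Response Generation modules can be sketched as follows. The paper does not specify the payload schema, so the field names (`image`, `expression`, `solution`, `error`) and the optional data-URL prefix are assumptions made for illustration, and both helper names are hypothetical. Only the standard library is used.

```python
import base64
import json
from typing import Optional


def decode_request(body: str) -> bytes:
    """Extract raw image bytes from a JSON request body.

    Assumes a hypothetical schema: {"image": "<base64 or data URL>"}.
    """
    payload = json.loads(body)
    encoded = payload["image"]
    # Canvas exports often look like 'data:image/png;base64,<data>';
    # keep only the part after the first comma in that case.
    if encoded.startswith("data:"):
        encoded = encoded.partition(",")[2]
    # validate=True rejects characters outside the base64 alphabet
    # (raises binascii.Error, a subclass of ValueError).
    return base64.b64decode(encoded, validate=True)


def build_response(
    expression: Optional[str] = None,
    solution: Optional[str] = None,
    error: Optional[str] = None,
) -> str:
    """Serialize the result envelope (hypothetical field names)."""
    return json.dumps(
        {"expression": expression, "solution": solution, "error": error}
    )
```

Keeping the recognized expression, the solution, and the error in one fixed envelope lets the frontend render all three cases with the same parsing code.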

Dataflow

The data flow through the system follows these steps:

Fig. 1.2 Dataflow diagram.

1. User Input: The user draws a mathematical expression on the frontend's drawing canvas.

2. Image Capture: The Drawing Canvas captures the drawing as an image.

3. Request Transmission: The image data is sent to the backend module’s API Endpoint.

4. Image Preprocessing: The image is preprocessed and noise is removed.

5. Gemini API Call: The Gemini API Integration Module sends the processed image to the Google Gemini API.


6. Gemini API Response: The Gemini API returns the recognized expression and/or solution.

7. Mathematical Processing (Optional): The Mathematical Processing Module performs additional calculations if required.

8. Response Generation: The Response Generation Module creates a JSON response containing the results.

9. Response Transmission: The API Endpoint Module sends the JSON response back to the frontend.

10. Results Display: The UI Component Module displays the results to the user.
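The backend portion of this flow (steps 4 to 8) can be sketched as a single orchestration function. This is an illustrative outline, not Calcura's actual code: the preprocessing, recognition, and solving stages are passed in as plain callables so that no Pillow or Gemini calls have to be guessed at, and the names `Result` and `run_pipeline` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Result:
    """Outcome handed to the response generation step (assumed fields)."""
    expression: Optional[str] = None
    solution: Optional[str] = None
    error: Optional[str] = None


def run_pipeline(
    image: bytes,
    preprocess: Callable[[bytes], bytes],
    recognize: Callable[[bytes], str],
    solve: Callable[[str], str],
) -> Result:
    """Run preprocessing, recognition, and solving in order.

    Any failure is converted into an error result instead of propagating,
    so a response can always be sent back to the frontend.
    """
    try:
        cleaned = preprocess(image)          # step 4: image preprocessing
        expression = recognize(cleaned)      # steps 5-6: recognition call
    except Exception as exc:
        return Result(error=f"recognition failed: {exc}")
    try:
        # step 7: mathematical processing on the recognized expression
        return Result(expression=expression, solution=solve(expression))
    except Exception as exc:
        # Keep the recognized text so the user can see what was read.
        return Result(expression=expression, error=f"could not solve: {exc}")
```

Injecting the three stages as parameters also makes the flow easy to test with stubs, for example `run_pipeline(data, lambda b: b, lambda b: "1+1", lambda e: "2")`.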

Results

For the Calcura app using the Gemini API, the input and output can be defined as follows:

Input:

Handwritten Mathematical Expressions:

Users can directly write mathematical expressions on the screen using touch, stylus, or mouse input. Examples: arithmetic operations, algebraic equations, or geometry-related questions.

Object and Shape Recognition:

Users can draw shapes (like rectangles, circles, or triangles) and annotate them with dimensions (such as side lengths or radii).

The app recognizes shapes and contextualizes them for problem-solving (e.g., area or perimeter calculations).

Object-Based Queries:

Users can sketch recognizable objects such as a car, tree, or ball.

The app interprets the sketches and provides contextual information.

For example: Sketching a car and writing "speed = 60 km/h, distance = 180 km" will prompt the app to calculate the time required.

Drawing a weight in free fall and mentioning Earth's gravity can trigger the app to calculate time or force.
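The two object-based examples above reduce to standard kinematics formulas: time = distance / speed for the car, and t = sqrt(2h / g) and F = m·g for the falling weight. The sketch below works them through in Python. The drop height and mass are not given in the text and appear only as function parameters; g = 9.81 m/s² is the usual approximation for Earth's surface gravity, and all function names are hypothetical.

```python
import math

G_EARTH = 9.81  # m/s^2, common approximation of Earth's surface gravity


def travel_time_hours(distance_km: float, speed_kmh: float) -> float:
    """Time needed to cover a distance at constant speed."""
    if speed_kmh <= 0:
        raise ValueError("speed must be positive")
    return distance_km / speed_kmh


def free_fall_time(height_m: float, g: float = G_EARTH) -> float:
    """Fall time from rest, ignoring air resistance: t = sqrt(2h / g)."""
    if height_m < 0:
        raise ValueError("height must be non-negative")
    return math.sqrt(2 * height_m / g)


def weight_force(mass_kg: float, g: float = G_EARTH) -> float:
    """Gravitational force on a mass in newtons: F = m * g."""
    return mass_kg * g


# The car example from the text: 180 km at 60 km/h takes 3 hours.
print(travel_time_hours(180, 60))  # 3.0
```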

Output:

Solutions for Mathematical Expressions:

Instant and accurate results for recognized equations and expressions.

Shape-Based Calculations:

Dynamic calculations based on recognized shapes and user annotations. For instance, drawing a rectangle and specifying side lengths will provide the area and perimeter.
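Once a shape and its annotated dimensions have been recognized, the shape-based calculation itself is simple geometry. A minimal sketch follows, assuming the recognizer has already produced the shape type and its dimensions; the class names are illustrative and not taken from the Calcura codebase.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Rectangle:
    """Rectangle described by its two annotated side lengths."""
    width: float
    height: float

    def area(self) -> float:
        return self.width * self.height

    def perimeter(self) -> float:
        return 2 * (self.width + self.height)


@dataclass(frozen=True)
class Circle:
    """Circle described by its annotated radius."""
    radius: float

    def area(self) -> float:
        return math.pi * self.radius ** 2

    def perimeter(self) -> float:
        # Circumference of the circle.
        return 2 * math.pi * self.radius


# A rectangle annotated with sides 4 and 3:
rect = Rectangle(width=4, height=3)
print(rect.area(), rect.perimeter())  # 12 14
```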

Object Recognition Results:

Identifies user-drawn objects (like "car," "tree") and offers meaningful interpretations, such as physics calculations for speed, time, or force.

This approach makes Calcura versatile, transforming it into an advanced tool for educational and practical problem-solving scenarios.

The development of this solution emphasizes faster processing, enabling quick interpretation of handwritten inputs and delivering prompt solutions. Its robustness ensures adaptability to diverse handwriting styles and complex notations. With features like multimodal capabilities, it lays the groundwork for future enhancements, such as image-based problem input. Real-time feedback provides instant solutions, and expanded notation support accommodates a broader range of mathematical symbols. The intuitive interface offers a natural and user-friendly handwriting input method, fostering active learning and engagement with mathematical concepts. Overall, this innovative solution transforms math problem-solving, enhancing learning outcomes in subjects like math and physics while boosting productivity for students, educators, and professionals.


Conclusion

Calcura demonstrates a promising and innovative approach to the complex problem of interpreting and solving handwritten mathematical expressions. The system effectively integrates a user-friendly frontend, featuring an intuitive drawing canvas for direct input, with a robust backend processing pipeline. A key strength of Calcura is its use of the Google Gemini API, which enables the recognition of intricate mathematical notations. The crucial preprocessing stage, employing the Python Imaging Library (Pillow), optimizes image quality to enhance the accuracy and effectiveness of the Gemini API. Moreover, the backend harnesses the power of NumPy and SymPy to execute precise numerical and symbolic computations. Efficient communication between the frontend and backend, facilitated by JSON, ensures a responsive and interactive user experience. This architecture highlights the significant potential of combining cutting-edge APIs with established mathematical libraries to develop practical and powerful tools for mathematical problem-solving.

Future Enhancements

Several avenues exist for extending Calcura's capabilities and improving its performance. These include:

1. Enhanced Preprocessing: Further research into preprocessing techniques could yield significant improvements in recognition accuracy. Exploring methods like adaptive thresholding, noise reduction tailored to handwritten input, and skew correction could be beneficial. Evaluating the impact of these techniques on Gemini API performance is crucial.

2. Contextual Understanding: Incorporating contextual understanding of mathematical expressions could improve recognition and interpretation. For instance, recognizing the context of a problem (e.g., algebra, calculus) could help disambiguate similar-looking symbols or expressions.

3. Accessibility Features: Enhancing accessibility for users with disabilities is essential. This could include features like screen reader compatibility, keyboard navigation, and alternative input methods.

4. Integration with Learning Platforms: Integrating Calcura with online learning platforms or educational tools could make it a valuable resource for students and educators. This could involve developing APIs for seamless data exchange and incorporating features for assessment and feedback.

References

[1] Malakar S, Ghosh M, Sarkar R, Nasipuri M (2020) Development of a Two-Stage Segmentation-Based Word Searching Method for Handwritten Document Images. J Intell Syst 29. doi: 10.1515/jisys-2017-0384

[2] Majumder S, Ghosh S, Malakar S, et al (2021) A voting-based technique for word spotting in handwritten document images. Multimed Tools Appl preprint:1–24. doi: https://doi.org/10.1007/s11042-020-10363-0

[3] Pal U, Roy RK, Kimura F (2012) Multi-lingual city name recognition for Indian postal automation. In: Proceedings of the International Workshop on Frontiers in Handwriting Recognition. IEEE, pp 169–173

[4] Ghosh S, Bhattacharya R, Majhi S, et al (2018) Textual Content Retrieval from Filled-in Form Images. In: Proceedings of the Workshop on Document Analysis and Recognition. Springer, pp 27–37

[5] Bhattacharya R, Malakar S, Ghosh S, et al (2020) Understanding contents of filled-in Bangla form images. Multimed Tools Appl 80:3529–3570
