About me

Hello, I'm Anirudha Joshi, a Data Engineer and Software Developer pursuing a Master of Science in Information Systems at Northeastern University, with graduation expected in August 2024. I am skilled in programming languages including Python, Java, and T-SQL, and familiar with tools such as Azure Data Studio, Snowflake, AWS S3, Power BI, and Tableau. My experience spans building machine learning models, improving data pipeline efficiency, and delivering insights through advanced analytics.

As a Data Analytics Intern at Karyopharm Therapeutics, I developed machine learning models that increased sales revenue by 14% and designed a demand forecasting dashboard with high accuracy. As a Data Engineer at Deloitte and Cognizant, I engineered data systems for enterprise-level transformation, cut batch processing times by 75%, and reduced daily manual activities by 20% through automation.

My projects include developing a PDF Explorer and Image Matcher platform using FastAPI, LangChain, and Snowflake, and leading a Twitter Sentiment Analysis project to extract insights from NBA tweets. My hands-on experience with big data and machine learning tools such as Hadoop, Kafka, and scikit-learn, combined with my proficiency in cloud services and ETL processes, enables me to contribute effectively to data-driven projects.

Outside of my technical work, I engage with creative projects and enjoy exploring the nuances of data through various lenses. I am eager to take on new challenges and opportunities where I can apply my blend of data engineering and analytics skills to deliver impactful solutions. If you're looking for a passionate and skilled data engineer to join your team, let's connect and discuss potential collaborations.

Resume

Education

  1. Northeastern University Boston, Massachusetts, US

    Master of Science in Information Systems September, 2022 - August, 2024

    Application Engineering Development, Data Science Engineering Methods and Tools, Big Data Systems and Intelligent Analytics, Database Management and Database Design

  2. Manipal University Jaipur Jaipur, Rajasthan, India

    B.Tech in Electronics and Communication Engineering August, 2014 - May, 2018

    Data Structures and Algorithms, Digital Image Processing, Embedded Systems, Computer Architecture, Computer Networks, Signal Processing

Experience

  1. Karyopharm Therapeutics Inc Newton, Massachusetts, US

    Data Analytics Intern June, 2023 - December, 2023
    • Spearheaded the creation of an in-house medical claims analytics solution, extracting data from Snowflake dbt into AWS Redshift using Python and turning it into actionable insights through Power BI dashboards, saving $1 million yearly

    • Developed machine learning models to optimize targeting of doctors as vaccine prescribers, boosting sales revenue by 14%

    • Achieved a 95% accuracy rate in predicting which doctors would become cancer vaccine prescribers for multiple myeloma, using advanced machine learning techniques and correlation-based feature selection to enable data-driven decision-making

    • Architected and implemented real-time data pipelines leveraging AWS Kinesis and AWS EMR for data analysis and reporting

    • Identified data dimensions in current systems and performed data modeling for inventory and patient modules based on business rules from the commercial marketing team, ensuring PII data encryption as per data governance rules

    • Integrated DevOps practices into the deployment pipeline, using Jenkins for REST APIs and automated testing in line with Agile methodology

  2. Deloitte Mumbai, Maharashtra, India

    Sr Data Engineer November, 2021 - May, 2022
    • Engineered an incremental Change Data Capture (CDC) system to replace the time-consuming legacy BI system, achieving a 97.6% accuracy rate, with results visualized on enterprise-level Salesforce CRM Lightning dashboards

    • Led the transition to AWS Cloud, migrating legacy systems to AWS S3, Redshift, and EC2 for real-time and batch processing

    • Consolidated 40+ KPIs of the CDC system by optimizing the data pipeline for sales and marketing analytics dashboards

    • Improved data processing efficiency for 50 GB of structured and unstructured data, ensuring data integrity by implementing ETL with Azure Data Factory and Spark under Agile Scrum methodology

    • Cultivated a high-performing team of 3 data engineers to enhance data accuracy between BI and legacy systems

    • Implemented CI/CD pipelines using Git and Terraform to automate ML code deployment across cloud environments

    • Orchestrated data pipelines integrating data from 60 SAP-BI reports into Salesforce CRM, using Azure Data Factory

  3. Cognizant Pune, Maharashtra, India

    Data Engineer November, 2018 - November, 2021
    • Gathered requirements through effective stakeholder communication, ensuring alignment across cross-functional teams

    • Directed 50+ data flows for various KPIs like sales, inventory, and orders by leveraging ETL from source legacy systems

    • Built a high-visibility sales dashboard for the executive team, ensuring precise data validation and focusing on sales metrics

    • Spearheaded a sales incentive calculation module that directly affected employee payouts of close to $2 million

    • Developed and optimized transactional and master data workflows in Informatica, integrated with Azure Data Warehouse, reducing batch processing times by 75%, from 4 hours to 1 hour

    • Initiated automation projects using Azure Databricks, SQL, Azure Data Factory, and PowerShell scripts, reducing daily manual activities by 20%

    • Managed a team of 4 to create snowflake schema data models and ETL processes for billing KPIs visualized in Power BI

Projects
