Intelligent Document Processing with AWS Textract
Our client is a Boston-based single-family office that seeks to automate and speed up the processing and review of tax documents. They came to JetSweep to build an Intelligent Document Processing Platform on AWS.
The Customer
The customer provides a broad suite of services, including investment management, philanthropic strategy, accounting, legal coordination, real estate and property oversight, and human resources support. With approximately 200 employees, the customer is a successful non-profit foundation that has impacted thousands of organizations through investment and program development.
The Challenge
The customer faces a significant challenge in efficiently processing K-1, K-3, and PFIC-related tax documents. These forms contain detailed partnership and foreign investment data that must be manually reviewed and entered into internal systems, a time-consuming, error-prone task that limits scalability and consistency.
The key business challenge lies in automating the extraction of structured data to support faster, more accurate financial reporting and reduce operational burden. Failure to address this challenge poses risks including increased costs from manual labor, potential errors in tax reporting, and the inability to scale document processing as data volume grows.
This project directly aligns with the goals of the program by leveraging AI-powered solutions, including a custom AWS Lambda function, to streamline document workflows, ensure data integrity, and set the foundation for broader AI integration across the customer's operations.
Why AWS?
The customer chose AWS for this project to leverage its AI infrastructure, services, and expert partner ecosystem. They were already working with AWS and wanted to maintain that relationship after proven success with efficiency and scalability.
Why JetSweep?
The customer is an existing JetSweep client: we have worked with them on security infrastructure, cost optimization, and migration engagements in the past. They chose us for this project because of our proven track record and the white-glove service we provide to our customers.
The Solution
This project focused on designing and deploying a secure, scalable document-processing pipeline using AWS-native services. We partnered with this customer to build a production-ready architecture capable of ingesting, analyzing, and enriching high volumes of documents with minimal operational overhead. The solution emphasizes automation, asynchronous processing, and extensibility, enabling the platform to grow alongside evolving data and integration needs.
We implemented an event-driven workflow that automatically initiates OCR and document analysis upon ingestion, ensuring reliable processing at scale while maintaining cost efficiency. The architecture also prepares the system for downstream AI enrichment and third-party integrations, creating a flexible foundation for future enhancements.
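The ingestion step described above can be sketched as a Lambda handler that reacts to an S3 upload event and starts an asynchronous Textract job. This is a minimal illustration, not the code deployed for the customer: the function names, the environment variable names (TEXTRACT_SNS_TOPIC_ARN, TEXTRACT_ROLE_ARN), and the choice of the FORMS and TABLES feature types are all assumptions.

```python
import os
import urllib.parse


def build_analysis_request(record, sns_topic_arn, role_arn):
    """Turn one S3 event record into kwargs for Textract StartDocumentAnalysis."""
    bucket = record["s3"]["bucket"]["name"]
    # Object keys arrive URL-encoded in S3 event notifications.
    key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
    return {
        "DocumentLocation": {"S3Object": {"Bucket": bucket, "Name": key}},
        # Assumption: forms and tables are the features of interest for tax documents.
        "FeatureTypes": ["FORMS", "TABLES"],
        # Textract publishes a completion message to this SNS topic when the job ends.
        "NotificationChannel": {"SNSTopicArn": sns_topic_arn, "RoleArn": role_arn},
    }


def handler(event, context, textract=None):
    """Start one asynchronous Textract job per uploaded document."""
    if textract is None:
        import boto3  # imported lazily so the sketch can be exercised with a stub client

        textract = boto3.client("textract")

    # Hypothetical environment variable names, chosen for this sketch.
    topic_arn = os.environ["TEXTRACT_SNS_TOPIC_ARN"]
    role_arn = os.environ["TEXTRACT_ROLE_ARN"]

    job_ids = []
    for record in event.get("Records", []):
        request = build_analysis_request(record, topic_arn, role_arn)
        response = textract.start_document_analysis(**request)
        job_ids.append(response["JobId"])
    return {"job_ids": job_ids}
```

Because the job is asynchronous, the handler returns as soon as Textract accepts the request, so the Lambda function does not wait (or bill) for the duration of the OCR work.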
We built the following in AWS:
Amazon S3 for secure document ingestion and storage
AWS Lambda to orchestrate OCR workflows and process results
Amazon Textract for asynchronous document analysis and OCR
Amazon SNS for job completion notifications and decoupled processing
Amazon Bedrock for generative AI post-processing and document intelligence
External API integration (Bipsync) to synchronize enriched data with downstream systems
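To illustrate how the SNS and Lambda components listed above might fit together, the following sketch shows a second Lambda handler that is triggered by the Textract completion notification and pages through the results. It is an illustration under assumptions, not the customer's implementation: the function names are hypothetical, and the Bedrock post-processing and Bipsync synchronization steps are deliberately left out because the case study does not describe their interfaces.

```python
import json


def parse_completion(sns_event):
    """Read the job ID and status from the message Textract publishes to SNS."""
    message = json.loads(sns_event["Records"][0]["Sns"]["Message"])
    return message["JobId"], message["Status"]


def collect_blocks(textract, job_id):
    """Page through GetDocumentAnalysis and return every block for the job."""
    blocks, token = [], None
    while True:
        kwargs = {"JobId": job_id}
        if token:
            kwargs["NextToken"] = token
        page = textract.get_document_analysis(**kwargs)
        blocks.extend(page.get("Blocks", []))
        token = page.get("NextToken")
        if not token:
            return blocks


def handler(event, context, textract=None):
    """Collect the text lines of a finished Textract job."""
    job_id, status = parse_completion(event)
    if status != "SUCCEEDED":
        # Failed jobs are reported back without fetching results.
        return {"job_id": job_id, "status": status, "lines": []}

    if textract is None:
        import boto3  # imported lazily so the sketch can be exercised with a stub client

        textract = boto3.client("textract")

    blocks = collect_blocks(textract, job_id)
    lines = [b["Text"] for b in blocks if b.get("BlockType") == "LINE"]
    # Downstream enrichment and synchronization would consume `lines` here (not shown).
    return {"job_id": job_id, "status": status, "lines": lines}
```

Decoupling the two handlers through SNS means the ingestion function and the result-processing function can fail, retry, and scale independently.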
The Benefits
Previously, manual K-1 data extraction required approximately 15-30 minutes per document for trained staff to review, extract key values, validate data, and enter into Bipsync. Our goal was to reduce processing time by 80% through automation. The automated pipeline now processes documents in under 2 minutes from upload to Bipsync record update, which represents a ~90% reduction in processing time.
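The ~90% figure can be checked with simple arithmetic. The short calculation below uses only the numbers quoted above; it treats 2 minutes as the automated processing time even though the pipeline finishes in under 2 minutes, so the results are lower bounds. The function name is chosen for illustration.

```python
def reduction_pct(before_minutes, after_minutes):
    """Percentage reduction in processing time."""
    return 100.0 * (before_minutes - after_minutes) / before_minutes


# Manual baseline of 15-30 minutes versus an automated run of at most 2 minutes.
low = reduction_pct(15, 2)
high = reduction_pct(30, 2)
print(f"{low:.1f}% to {high:.1f}%")  # prints "86.7% to 93.3%"
```

Both ends of the range clear the 80% goal, and the midpoint sits at roughly 90%.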
Additionally, manual data entry historically resulted in an estimated 2-5% error rate due to transcription mistakes and oversights on complex multi-part forms. Our goal was to achieve at least 95% accuracy on extracted K-1 data fields. Automated extraction with Bedrock validation achieves more than 95% accuracy on standard K-1 fields, and all documents are flagged for human validation review before final processing.
Next Steps
The customer plans to continue working with JetSweep on future projects to further improve the efficiency and accuracy of tax document review with the help of AWS AI services.
About JetSweep
JetSweep is an Advanced AWS Consulting Partner and Managed Services Provider. Our team of solution architects works with customers to optimize their experience on the cloud and transform their IT capabilities. Our consulting services focus on helping customers visualize and prepare for the challenges of a changing technology landscape, and on providing clear and concise solutions. Our team specializes in disaster recovery, end-user computing, web application development, data analytics, Generative AI, cost optimization, migration, and modernization.