LLM Agents

About Us

At ZharfaTech, we build the future of Generative AI, from Agentic and Conversational AI to Speech/Image technologies. Our team tackles cutting-edge challenges using the latest tools and frameworks. Our mission is to push the boundaries of artificial intelligence, delivering scalable, high-performance solutions.

About the Role

As a Senior AI Engineer, you will architect and optimize AI solutions using modern frameworks, scalable deployment strategies, and end-to-end system design. You’ll work with tools like PyTorch, Hugging Face Transformers, and vector databases, ensuring robust, secure, and efficient AI deployment. We’re looking for someone who thrives in a fast-paced, innovation-driven environment and is passionate about building the next generation of AI systems.

What You'll Do

API & Backend Engineering: Build high-performance REST APIs using FastAPI, document endpoints with Swagger/OpenAPI, and secure them with TLS/SSL and auth workflows.
Deployment & Infrastructure: Containerize AI systems with Docker, orchestrate multi-service environments with Docker Compose, and deploy on AWS/GCP/Azure with CI/CD pipelines (GitHub Actions, Jenkins).
Model Optimization: Implement techniques like quantization (Unsloth), pruning, and distributed training to maximize performance and scalability.
Database & Data Systems: Manage vector databases (Qdrant, pgvector) and relational databases (PostgreSQL) for AI applications.
AI/ML Development: Design and train state-of-the-art models (NLP, computer vision, generative AI) using PyTorch, Hugging Face Transformers, and TensorRT-LLM. Optimize inference with vLLM, AutoGPTQ, AutoAWQ, and FastEmbed for efficiency.
Innovation & Leadership: Experiment with emerging tools like CrewAI (multi-agent frameworks) and OpenHands (gesture recognition). Mentor junior engineers and collaborate with cross-functional teams to align AI solutions with business goals.

What You'll Bring

AI/ML Expertise: 5+ years of hands-on experience with Python, PyTorch, and Hugging Face Transformers. Deep understanding of model optimization (quantization, pruning, TensorRT-LLM, AutoAWQ).
Deployment & DevOps: Proficiency in Docker, CI/CD pipelines, and cloud platforms (AWS/GCP/Azure). Experience with vector databases (Qdrant, pgvector) and PostgreSQL.
API & Security: Strong knowledge of FastAPI, Swagger, REST API design, and security best practices (TLS/SSL, auth workflows).
Big Data & AI Workflow Tools: Familiarity with Airflow, MLflow, or Langfuse for data processing and orchestration.
Innovation Mindset: Passion for experimenting with emerging AI tools and frameworks (e.g., FastWhisper, CrewAI).
Leadership & Collaboration: Ability to mentor junior engineers, lead technical discussions, and work cross-functionally in a fast-paced environment.

Nice to Haves

Knowledge of multi-agent AI frameworks (CrewAI, AutoGen, Dify, n8n).
Contributions to open-source AI projects or published research in AI/ML.
Familiarity with real-time inference systems and edge AI deployment.
