
Pythian’s Hadoop Team Resume

Each individual on Pythian’s big data team brings passion, insight and knowledge. As a team, that collective wisdom and vision have put Pythian at the forefront of the emerging big data market. Our top-calibre team comprises sought-after speakers, published authors and frequent bloggers who’ve never met a challenge they couldn’t solve.

Team profile

  • Big data consultants
  • Solutions architects
  • Big data developers


Skills and experience

  • Big data platform recommendations and deployment—from hardware and software recommendations to complete deployments
  • Cloud deployments: Amazon Web Services (AWS), Microsoft Azure, Google Cloud Platform, Rackspace
  • Performance tuning to meet SLAs for response time, scale and real-time processing
  • Machine Learning and Deep Learning
  • High availability: design and implement an optimal HA architecture to support SLAs
  • Security: requirements analysis, design and implementation
  • Data modeling
  • Data ingestion, transformations and ETL
  • Real-time events processing
  • Business Intelligence: integration and optimization for BI use cases

Areas of expertise

  • Hadoop distributions: Cloudera, MapR, Hortonworks, Amazon EMR
  • Hadoop ecosystem: Apache Hive, YARN, Apache Pig, Apache HBase, Apache Oozie, Azkaban, Apache Mahout, Apache ZooKeeper, Apache Spark and more
  • Hadoop security: Kerberos, Apache LDAP, Active Directory, encryption
  • BI tools/visualization: Platfora, Tableau and more
  • Cloudera technologies: Cloudera Impala, Cloudera Search, Apache Sentry, Cloudera Manager
  • NoSQL: Apache HBase, Apache Cassandra, MongoDB, Couchbase
  • Data ingestion: Apache Kafka, Apache Flume, Apache Sqoop
  • Complex event processing: Apache Storm, Spark Streaming
  • Search engines: Apache Solr, Elasticsearch
  • Cloud: AWS, Microsoft Azure, Google Cloud Platform
  • AWS tools: Redshift, DynamoDB, RDS, Kinesis, Data Pipeline, EMR, SQS, SNS, etc.
  • Google Cloud Platform: BigQuery, Dataflow, Compute Engine
  • Azure Machine Learning platform
  • Machine-learning products: Spark MLlib, Mahout, GraphLab, R, Python ecosystem, scikit-learn
  • Deep Learning: Neural Networks, Computer Vision, Natural Language Processing

Programming languages

  • Python
  • Java
  • C++/C
  • JavaScript
  • Scala
  • Ruby


Certifications

  • Hortonworks Certified Developer
  • Cloudera Certified Administrator for Apache Hadoop
  • Cloudera Certified Developer for Apache Hadoop
  • MapR Certified Administrator
  • Certified Google Cloud Developer

Industry recognition

  • Cloudera Champion of Big Data
  • Oracle ACE Director
  • Author of Hadoop Cluster Deployment
  • Finalist for Data Impact Award in Business Impact category
  • Winners of Cloudera Impala Hackathon
