Role: Large scale data analytics
Technology: Java, PHP – Hadoop: HDFS, MapReduce, HBase, Hive – Membase – Memcache – MySQL and more.
Buy, measure and optimize advertising spend
A leading independent software platform that enables brands and agencies to plan, buy, measure and optimize their global advertising wanted to improve transparency by leveraging real-time user data across television, video, display, mobile, social and other brand advertising initiatives. This required the ability to process information quickly in order to identify trends and gain insights.
Homegrown workflow engine for troubleshooting
We began this project when the Hadoop ecosystem was still in its infancy. Batch Hadoop jobs identified every visitor as new or returning from the HTTP queries. Oozie was still at an early stage, and the homegrown workflow engine used for troubleshooting ran into distributed computing issues in lock management and scheduling. The project involved managing 500 servers spread over 4 regions. While troubleshooting the performance of the serialization and compression layers, we found that ints, longs and collections in Java introduce significant size overhead unless they are accounted for when the job is designed.
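To make the size overhead concrete, here is a minimal sketch that compares fixed-width encoding of longs with a variable-length encoding. It is an illustration only and not the project's actual code: the class name VarLongSketch, its methods and the sample counters are all hypothetical, and the technique shown is a generic zigzag varint built on the JDK alone. Hadoop ships its own variable-length types (for example VLongWritable), which use a different byte layout.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical illustration: fixed-width vs. variable-length encoding of longs.
final class VarLongSketch {

    // Fixed-width: every long costs 8 bytes, however small its value.
    public static byte[] encodeFixed(long[] values) {
        ByteBuffer buffer = ByteBuffer.allocate(values.length * Long.BYTES);
        for (long v : values) {
            buffer.putLong(v);
        }
        return buffer.array();
    }

    // Variable-length: zigzag maps small negative values to small unsigned ones,
    // then each byte carries 7 payload bits; the high bit means "more bytes follow".
    public static byte[] encodeVariable(long[] values) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (long v : values) {
            long zigzag = (v << 1) ^ (v >> 63);
            while ((zigzag & ~0x7FL) != 0) {
                out.write((int) ((zigzag & 0x7F) | 0x80));
                zigzag >>>= 7;
            }
            out.write((int) zigzag);
        }
        return out.toByteArray();
    }

    // Reverses encodeVariable; count is the number of values to read back.
    public static long[] decodeVariable(byte[] bytes, int count) {
        long[] result = new long[count];
        int pos = 0;
        for (int i = 0; i < count; i++) {
            long zigzag = 0;
            int shift = 0;
            while (true) {
                byte b = bytes[pos++];
                zigzag |= (long) (b & 0x7F) << shift;
                if ((b & 0x80) == 0) {
                    break;
                }
                shift += 7;
            }
            result[i] = (zigzag >>> 1) ^ -(zigzag & 1);
        }
        return result;
    }

    public static void main(String[] args) {
        long[] counters = {1, 42, 300, 70_000, -5}; // made-up sample values
        byte[] fixed = encodeFixed(counters);
        byte[] packed = encodeVariable(counters);
        System.out.println("fixed-width bytes:     " + fixed.length);
        System.out.println("variable-length bytes: " + packed.length);
        System.out.println("round trip ok:         "
                + Arrays.equals(decodeVariable(packed, counters.length), counters));
    }
}
```

For these five sample counters, the fixed-width form takes 40 bytes and the variable-length form takes 8. The saving depends entirely on how small the values usually are, which is why the choice belongs in job design rather than being added afterwards.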
Stability and performance improved
As the stability and performance issues were solved, engagement reports were published more frequently, and micro trends could be measured and used in targeting. For example, a tweet can cause a topic to be searched more often, which makes some inventory hotter than the rest. That is important information for ad targeting.
If the same information only becomes available later, leveraging it may be less effective, because such trends can be short-lived. The more timely reporting translated into customer success with some of the key adopters of the platform, especially in the impact and reach reports for advertisers.