Transcript operators
InfoSphere Streams for Real Time Analytics in Financial Services Industry
Krishna Mamidipaka, krishnag@us.ibm.com
Roger Rea, rrea@us.ibm.com
Housekeeping
• We value your feedback - don't forget to complete your evaluation for each session you attend and hand it to the room monitors at the end of each session
• Overall Conference Evaluation will be provided at the General Session on Friday
• Visit the Expo Solutions Centre
• Please remember this is a 'non-smoking' venue!
• Please switch off your mobile phones
• Please remember to wear your badge at all times
Disclaimer
The Information regarding potential future products is intended to outline our general product direction and it should not be relied on in making a purchasing decision. The information mentioned regarding potential future products is not a commitment, promise, or legal obligation to deliver any material, code or functionality. Information about potential future products may not be incorporated into any contract. The development, release, and timing of any future features or functionality described for our products remains at our sole discretion.
Agenda
• Financial Markets Business Challenges
• Industry Technical Challenges
• InfoSphere Streams
• Trend Calculator
• Financial Toolkit
• Data Mining in Real Time
• InfoSphere Streams Directions
Firms Must Capitalize on Drivers of Change
Drivers
• Markets becoming electronic
• Real-time data pressures
• Information availability
• Transaction costs pressures
Implications
• Speed as source of Alpha
• Volume is a barrier
• Transparency is required
• Detailed analysis of trading process
Actions
• Accelerate the end-to-end marketplace connectivity and execution
• Increase capacity to handle current and forecasted volumes
• Store, retrieve and distribute comprehensive time series data in a timely manner
• Access to broader markets by accessing multiple markets
Real time data pressures
• We are in a technology arms race
• Latency reductions with a clear business value or cost associated
• Exponential increases in volumes
For US equity electronic trading brokerage, 1 millisecond = $4M in annual revenue (Source: Tabb Group)
The Volume, Complexity & Semantic Depth of data to be analysed will increase significantly
[Figure: data sources feeding Analytics & Insight]
Structured data: Historical Trade Data, Risk Analytics Data, Market Data
Tomorrow? Structured & Unstructured data: Risk Analytics Data, Historical Trade Data, Market Data, Internal Message Bus, Government Statistics, Real World Sensors, Blogs & Commentary, Video News Feeds, Corporate Press Reports, Weather Data, Web Pages, RSS Feeds + Other Feeds
Information overload
The Transaction Life Cycle or latency loop – end to end latency is the key to success and there are no prizes for coming second
[Figure: the latency loop]
Investment / trading goals
Market Data / WAN Connectivity
Trading Decision (What to Buy/Sell) / Middleware
Execution Algorithm (VWAP) / CEP Engines
Order Routing Decision / OMS/EMS
Matching / Exchanges
Transaction Cost Analysis
Latency measurement is a competitive advantage to deliver Alpha
Speed: Current approaches reaching limits, based on x86 and networking technologies
The Manycore programming challenge
Programmers cannot cope with thousands of threads and complex data flows using existing programming models
Yesterday: Single Core, Single Thread, 100% Serial Programming
Today: Multicore (2-16), Multithread (10s), 80/20 Serial/Parallel Programming
Tomorrow: Manycore (32-100s), 20/80 Serial/Parallel Programming. Threading model breaks as complexity exceeds programmer capability
Options for exposing parallelism in a programming model
Parallelism Fully Exposed
• Full exposure of machine details
• Only usable by experts
• High performance
• Low productivity
Partial Exposure
• Limits exposure to machine details
• Expands programmer community
• High performance
• Higher productivity for C/C++ class programmers - Bounds checks, pointer checks, strong typing, etc.
Parallelism Implicit
• No exposure of machine details, e.g., Hadoop/map reduce, IBM Streams Processing Language
• Usable by larger number of programmers
• High Performance
• High Productivity
Time is ripe for a new era of computing
• Emerging trends create need for new languages
– Scientific programming: Fortran
– Business programming: Cobol
– Systems programming at higher level: C
– Increased productivity: C++
– Web programming: Java
• Streaming data sources and multicore architectures
– Streams Processing Language
Delivering ‘Continuous Intelligence’ with Powerful Analytics
Automated Options Market Making:
– Peak throughput of 10 million messages per second
– Mean latency under 100 microseconds across 28 dual quad-core x86 blades
[Figure labels: Real time delivery; Powerful Analytics; Millions of events per second; Traditional / Non-traditional data sources; Microsecond Latency]
IBM InfoSphere Streams v1.2
Development Environment
• Eclipse IDE
• StreamSight
• Stream Debugger
Runtime Environment
• RHEL v5.3 or v5.4
• x86 multicore hardware
• InfiniBand support
• Up to 125 servers
Toolkits & Adapters
• Front Office 3.0
• Connectors to data sources
• Operator Library
• Financial Toolkit
• Mining Toolkit
Scalable stream processing
• InfoSphere Streams provides
– A programming model and IDE for defining data sources and software analytic modules called operators that are fused into process execution units (PEs)
– Infrastructure to support the composition of scalable stream processing applications from these components
– Deployment and operation of these applications across distributed x86 processing nodes, when scaled processing is required
– Stream connectivity between data sources and PEs of a stream processing application
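The idea of composing operators and fusing them into one execution unit can be sketched in plain Python. This is not Streams Processing Language and not the Streams runtime; it is only an analogy that assumes a stream is an iterable of dicts and an operator is a function from one stream to another. The names filter_symbols, add_notional, fuse and the sample ticks are hypothetical.

```python
from functools import reduce
from typing import Any, Callable, Dict, Iterable, Iterator

Tick = Dict[str, Any]                                  # one stream tuple (assumed shape)
Operator = Callable[[Iterable[Tick]], Iterator[Tick]]  # stream in, stream out


def filter_symbols(wanted: set) -> Operator:
    """Operator that keeps only ticks for the requested symbols."""
    def op(stream: Iterable[Tick]) -> Iterator[Tick]:
        return (t for t in stream if t["symbol"] in wanted)
    return op


def add_notional() -> Operator:
    """Operator that enriches each tick with price * volume."""
    def op(stream: Iterable[Tick]) -> Iterator[Tick]:
        for t in stream:
            yield {**t, "notional": t["price"] * t["volume"]}
    return op


def fuse(*operators: Operator) -> Operator:
    """Chain operators so they run inside one process, loosely like a fused PE."""
    def fused(stream: Iterable[Tick]) -> Iterator[Tick]:
        return reduce(lambda s, op: op(s), operators, stream)
    return fused


# Hypothetical sample data standing in for a data source.
ticks = [
    {"symbol": "IBM", "price": 125.0, "volume": 100},
    {"symbol": "XYZ", "price": 10.0, "volume": 50},
    {"symbol": "IBM", "price": 126.0, "volume": 200},
]

pe = fuse(filter_symbols({"IBM"}), add_notional())
results = list(pe(ticks))
```

Because each operator only sees a stream and returns a stream, the same operators could in principle be chained in one process or placed in separate processes connected by a transport, which is the flexibility the slide describes.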
Trend Calculator Example
[Figure: Trend Calculator data flow]
Inputs: Symbols to be output; Trend File 1 playback; Trend File 2 playback; Trend File 3 playback; Algo Parameters Per Symbol
Output: Up/down trend for Requested symbols
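The slide does not say which algorithm the Trend Calculator uses. As a minimal sketch of the general shape (requested symbols, per-symbol parameters, an up/down output), the Python below assumes the only per-symbol parameter is a window length and that the trend is decided by comparing the latest price with the mean of that window. The class name, the rule and the sample prices are all assumptions made for illustration.

```python
from collections import deque
from typing import Deque, Dict, Optional, Set


class TrendCalculator:
    """Toy up/down trend detector; the real algorithm is not given in the slides."""

    def __init__(self, requested: Set[str], window_by_symbol: Dict[str, int],
                 default_window: int = 3) -> None:
        self.requested = requested                # symbols to be output
        self.window_by_symbol = window_by_symbol  # assumed per-symbol parameter
        self.default_window = default_window
        self.history: Dict[str, Deque[float]] = {}

    def on_tick(self, symbol: str, price: float) -> Optional[str]:
        """Return 'up', 'down' or 'flat', or None when there is nothing to report."""
        if symbol not in self.requested:
            return None
        window = self.window_by_symbol.get(symbol, self.default_window)
        prices = self.history.setdefault(symbol, deque(maxlen=window))
        prices.append(price)
        if len(prices) < window:
            return None                           # not enough data yet
        mean = sum(prices) / len(prices)
        if price > mean:
            return "up"
        if price < mean:
            return "down"
        return "flat"


calc = TrendCalculator(requested={"IBM"}, window_by_symbol={"IBM": 3})
trends = [calc.on_tick("IBM", p) for p in (10.0, 11.0, 12.0, 9.0)]
```

Keeping the state per symbol means each symbol can be processed independently, which is what lets this kind of calculation be spread across processes or machines.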
Streams offers tremendous deployment flexibility
With only a simple re-compile of the application:
• All on one machine, fused into one multi-threaded process
• All on one machine, each operator in its own process
• Each operator in its own process, each process on its own machine
Trend Calculator Example
Financial Services Toolkit
Speeds development of Streams financial domain applications
• Adapters layer used by top two layers and user-written apps
• Functions layer used by top layer and user-written apps
• Solution Frameworks are “starter” applications that target a particular use case
Adapters, Functions, Utilities
• Financial Information Exchange (FIX) Adapters
– fixInitiator Operator, fixAcceptor Operator, FixMessageToStream Operator, StreamToFixMessage Operator
• WebSphere Front Office for Financial Markets (WFO) Adapters
– WFOSource Operator, WFOSink Operator
• WebSphere MQ Low-Latency Messaging (LLM) Adapters
– MQRmmSink Operator
• Functions:
– Coefficient of Correlation
– “The Greeks” (Put/Call values, Delta, Theta, Rho, Charm, DualDelta, etc.)
• Operators:
– Wrap the QuantLib open source financial analytics package
– Provide operators to compute the theoretical value of an option:
  • EuropeanOptionValue Operator: 11 different analytic pricing engines, e.g. Black Scholes, Integral, Finite Differences, Binomial, Monte Carlo, etc.
  • AmericanOptionValue Operator: 11 different analytic pricing engines, e.g. Barone Adesi Whaley, Bjerksund Stensland, Additive Equiprobabilities, etc.
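To make "theoretical value of an option" concrete, here is a small sketch of the standard Black Scholes closed form for a European option on a non-dividend-paying stock. It does not use QuantLib or the toolkit's EuropeanOptionValue operator, whose interfaces are not shown in the slides; the function names and the sample inputs are assumptions for illustration only.

```python
from math import erf, exp, log, sqrt
from typing import Tuple


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def black_scholes(spot: float, strike: float, rate: float,
                  vol: float, years: float) -> Tuple[float, float]:
    """Return (call, put) theoretical values for a European option, no dividends."""
    d1 = (log(spot / strike) + (rate + 0.5 * vol ** 2) * years) / (vol * sqrt(years))
    d2 = d1 - vol * sqrt(years)
    discounted_strike = strike * exp(-rate * years)
    call = spot * norm_cdf(d1) - discounted_strike * norm_cdf(d2)
    put = discounted_strike * norm_cdf(-d2) - spot * norm_cdf(-d1)
    return call, put


# Hypothetical inputs: spot 100, strike 100, 5% risk-free rate, 20% volatility, 1 year.
call, put = black_scholes(100.0, 100.0, 0.05, 0.2, 1.0)
# call is roughly 10.45 and put roughly 5.57
```

The other pricing engines named above (binomial, finite differences, Monte Carlo) trade this closed form for numerical methods, which is why a library such as QuantLib is wrapped rather than reimplemented.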
Equities Trading “Starter Application”
• Modular design
• Components are plug-replaceable – extend these or substitute your own
• Demonstrates how trading strategies may be swapped out at runtime, without stopping the rest of the application
TradingStrategy module looks for opportunities that have specific quality values and trends
OpportunityFinder module looks for opportunities and computes quality metrics
SimpleVWAPCalculator module computes a running volume-weighted average price metric
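A running volume-weighted average price is the cumulative sum of price times volume divided by the cumulative volume. The sketch below shows that calculation in Python; it is not the toolkit's SimpleVWAPCalculator, and the class name and sample trades are hypothetical.

```python
from typing import Optional


class RunningVWAP:
    """Running VWAP: sum(price * volume) / sum(volume) over all trades seen so far."""

    def __init__(self) -> None:
        self.notional = 0.0  # cumulative price * volume
        self.volume = 0      # cumulative traded volume

    def update(self, price: float, volume: int) -> Optional[float]:
        """Fold one trade into the totals and return the current VWAP."""
        self.notional += price * volume
        self.volume += volume
        if self.volume == 0:
            return None      # no volume traded yet, VWAP undefined
        return self.notional / self.volume


vwap = RunningVWAP()
first = vwap.update(10.0, 100)   # 10.0
second = vwap.update(12.0, 300)  # (1000 + 3600) / 400 = 11.5
```

Only two running totals are kept, so the state per symbol stays constant in size no matter how many trades arrive.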
Options Trading “Starter Application”
DataSources module consumes incoming data; formats and maps for later use
Pricing module computes theoretical put and call values
Decision module matches theoretical values against incoming market values to identify buying opportunities
[Figure: Option Price, Stock Price, Stock Information and Risk Free Rate feed DataSources (Data Filtering and Preparation), then Pricing (Theoretical Price Computation), then Decision (Identification of Buying Opportunities), then Data Sinks. Stream labels: OptionsValue, Stock, RiskFreeRate, OptionsPriceFeedData]
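The Decision step can be sketched as a comparison between each option's theoretical value and its quoted market price. The slides do not give the actual matching rule, so the Python below assumes a simple one: flag a contract when the theoretical value exceeds the market price by at least a minimum edge. The Quote type, the min_edge parameter, the contract names and the prices are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Quote:
    contract: str             # hypothetical contract identifier
    market_price: float       # incoming market value
    theoretical_price: float  # value produced by the pricing step


def buying_opportunities(quotes: Iterable[Quote], min_edge: float = 0.05) -> List[str]:
    """Return contracts whose theoretical value beats the market price by min_edge."""
    return [
        q.contract
        for q in quotes
        if q.theoretical_price - q.market_price >= min_edge
    ]


# Hypothetical sample quotes.
quotes = [
    Quote("CALL_A", market_price=10.20, theoretical_price=10.45),  # underpriced
    Quote("PUT_B", market_price=5.60, theoretical_price=5.57),     # not underpriced
]
opportunities = buying_opportunities(quotes)
```

Keeping the rule in one small function mirrors the modular design of the starter application: the pricing and data preparation steps do not need to change when the decision rule does.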
Multinational Mutual Funds Manager and Broker
• High speed market trend calculation system that can provide instant insights into the market behavior
• Improved development time from days to hours to add new features to the trend calculation system using the Streams programming model
• Customizable to run on one server or distributed across many servers to garner more compute power
• Visualization tools for effective live trade monitoring and risk assessment
Transforming the Information Supply Chain to reduce the time to action!
[Figure: Elapsed Time to Action. Labels: Analytical Modeling & Information Dashboards; Planning; Scorecarding; Operational Reports; Bus Process & Event Mgmt; Reports; Ad-hoc Queries; SOURCES; WAREHOUSE; DATA INTEGRATION; OPERATIONAL DATA STORES; DATAMARTS]
Stream Computing:
• Reduces Time to Action
• Widens the aperture
• Reduces costs
[Figure: Time to Action with stream computing, adding "More context". Labels: Operational Reports; Bus Process & Event Mgmt; Analytical Modeling & Information Dashboards; Planning; Scorecarding; Reports; Ad-hoc Queries; SOURCES; WAREHOUSE; DATA INTEGRATION; OPERATIONAL DATA STORES; DATAMARTS]
Market Surveillance & Fraud applications
[Figure labels: Solution User Interface; Real time analysis processing; Rule Parameters; Alerts; Market Feeds and Trade Data; Enrichment; Existing business rules; Additional sophisticated analytics; Collected results; Historical; PMML Model Scoring]
What are key advantages of Streams?
Language built for Streaming applications:
• Reusable operators
• Rapid application development
• Continuous “pipeline” processing
Compiling groups of operators into single processes enables:
• Efficient use of cores
• Distributed execution
• Very fast data exchange
• Can be automatic or tuned
• Can be scaled with the push of a button
Use the data that gives you a competitive advantage:
• Can handle virtually any data type
• Use data that is too expensive and time sensitive for other approaches
Easy to extend:
• Built in adaptors
• Extend with C++ and Java
• Extend running applications
Extremely flexible and high performance transport:
• Very low latency
• High data rates
IBM InfoSphere Streams directions
Tools
• Streams Studio enhancements
• Video/audio analytics
• Text/unstructured analytics
• Streams Processing Language improvements
• Native XML support
Runtime
• High Availability
• Expanded platform support
• Performance improvements
Adapters
• WebSphere MQ
• RSS feeds
• Mashup Hub
• WebSphere Business Events
• Oracle
• SQL Server
• MySQL
[Figure labels: Cognos 8 BI; Millions of events per second; WebSphere Business Events; InfoSphere Warehouse; Data in motion; Front Office; IBM Mashup Hub; Millisecond Latency; Existing business information]
All statements regarding IBM's plans, directions, and intent are subject to change or withdrawal without notice. Any reliance on these statements is at the relying party's sole risk and will not create any liability or obligation for IBM.
InfoSphere Streams sessions
Session 3666A: InfoSphere Streams for Real Time Analytics in Financial Services Industry
Time: Thursday May 20, 10:45 AM - 11:35 AM. Location: Marriott Park Hotel, Room 14
Session 3661A: InfoSphere Streams helps Stockholm build Ver 2.0 Traffic Control System
Time: Friday May 21, 09:00 AM - 09:50 AM. Location: Marriott Park Hotel, Room 13
Session 3692A: InfoSphere Streams at Marine Institute of Ireland: Deep Dive
Time: Friday May 21, 11:30 AM - 12:30 PM. Location: Marriott Park Hotel, IOD Mini Theatre 3
Demo Room: InfoSphere Streams Demonstrations
Time: Wednesday 10AM - 6PM, Thursday 10AM - 5PM, Friday 9AM - 2PM. Location: Marriott Park Hotel, IOD Demo Room Station 19
Mini Theater on Expo Floor (Marriott Park Hotel, InfoSphere Mini Theater Expo Floor)
• Wednesday 10:30 - 11:30: InfoSphere Streams in Telco
• Thursday 12:30 - 13:00: InfoSphere Streams Business Insight
• Thursday 16:30 - 17:00: Leverage Warehouse, SPSS with Streams