CS 290C: Formal Models for Web Software Lecture 9: Analyzing Data Models Using Alloy Analyzer and SMT-Solvers Instructor: Tevfik Bultan.
Three-Tier Architecture
[Diagram: Browser, Web Server, Backend Database]
Three-Tier Arch. + MVC Pattern
[Diagram: Browser, Web Server (Controller, Views, Model), Backend Database]
• MVC pattern has become the standard way to structure web applications:
– Ruby on Rails
– Zend for PHP
– CakePHP
– Struts for Java
– Django for Python
– …
Benefits of the MVC-Architecture
• Benefits of the MVC architecture:
– Separation of concerns
– Modularity
– Abstraction
• These are the basic principles of software design
• Can we exploit these principles for analysis?
A Data Model Verification Approach
[Diagram: MVC Application (Ruby on Rails) → Data Model (ActiveRecords), isolated by MVC design principles → Automatic Extraction, with data model properties added → Formal Model (Alloy) → Bounded Verification (Alloy Analyzer)]
Rails Data Models
• Data model verification: Analyzing the associations/relations between data objects
• Specified in Rails using association declarations inside the ActiveRecord files
– The basic relation types
  • One-to-one
  • One-to-many
  • Many-to-many
– Extensions to the basic relations using Options
  • :through, :conditions, :polymorphic, :dependent
The Three Basic Relations in Rails
• One-to-One (One-to-ZeroOrOne)
  class User < ActiveRecord::Base
    has_one :account
  end
  class Account < ActiveRecord::Base
    belongs_to :user
  end
  [Diagram: User (1) to Account (0..1)]
• One-to-Many
  class User < ActiveRecord::Base
    has_many :projects
  end
  class Project < ActiveRecord::Base
    belongs_to :user
  end
  [Diagram: User (1) to Project (*)]
The Three Basic Relations in Rails
• Many-to-Many
  class Author < ActiveRecord::Base
    has_and_belongs_to_many :books
  end
  class Book < ActiveRecord::Base
    has_and_belongs_to_many :authors
  end
  [Diagram: Author (*) to Book (*)]
Options to Extend the Basic Relations
• :through Option
– To express transitive relations, or
– To express a many-to-many relation using a join model as opposed to a join table
• :conditions Option
– To relate a subset of objects to another class
• :polymorphic Option
– To express polymorphic relations
• :dependent Option
– On delete, this option expresses whether to delete the associated objects or not
The :through Option
class Book < ActiveRecord::Base
  has_many :editions
  belongs_to :author
end
class Author < ActiveRecord::Base
  has_many :books
  has_many :editions, :through => :books
end
class Edition < ActiveRecord::Base
  belongs_to :book
end
[Diagram: Author (1) to Book (*), Book (1) to Edition (*), Author (1) to Edition (*)]
The :conditions Option
class Account < ActiveRecord::Base
  has_one :address, :conditions => "address_type='Billing'"
end
class Address < ActiveRecord::Base
  belongs_to :account
end
[Diagram: Account to Address, restricted to address_type='Billing']
The :polymorphic Option
class Address < ActiveRecord::Base
  belongs_to :addressable, :polymorphic => true
end
class Account < ActiveRecord::Base
  has_one :address, :as => :addressable
end
class Contact < ActiveRecord::Base
  has_one :address, :as => :addressable
end
[Diagram: Account and Contact each related to Address]
The :dependent Option
class User < ActiveRecord::Base
  has_many :contacts, :dependent => :destroy
end
class Contact < ActiveRecord::Base
  belongs_to :user
  has_one :address, :dependent => :destroy
end
[Diagram: User (1) to Contact (*), Contact (1) to Address (0..1)]
• :delete directly deletes the associated objects without looking at their dependencies
• :destroy first checks whether the associated objects themselves have associations with the :dependent option set
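The difference between :delete and :destroy can be illustrated with a small simulation. This is a minimal sketch in plain Ruby, not Rails itself: the `Store` class, its methods, and the encoding of objects as `[class, id]` pairs are all hypothetical, chosen only to mimic the User, Contact, Address example above.

```ruby
require "set"

# Hypothetical encoding (not Rails): an object is a [class, id] pair, and
# `parent` plays the role of the foreign key stored on the belongs_to side.
class Store
  attr_reader :objects, :parent

  # dependent: { owner_class => { child_class => :destroy or :delete } }
  def initialize(dependent)
    @dependent = dependent
    @objects = Set.new
    @parent = {}
  end

  def add(obj, owner = nil)
    @objects << obj
    @parent[obj] = owner if owner
    obj
  end

  def children(owner, child_class)
    @parent.select { |child, o| o == owner && child[0] == child_class }.keys
  end

  # :delete semantics: remove the row only, ignoring its own associations.
  def delete(obj)
    @objects.delete(obj)
    @parent.delete(obj)
  end

  # :destroy semantics: honor the object's own :dependent options first.
  def destroy(obj)
    (@dependent[obj[0]] || {}).each do |child_class, mode|
      children(obj, child_class).each { |c| mode == :destroy ? destroy(c) : delete(c) }
    end
    delete(obj)
  end

  # Objects whose owner no longer exists.
  def dangling
    @parent.reject { |_child, owner| @objects.include?(owner) }.keys
  end
end

def build(user_contacts_mode)
  s = Store.new({ User: { Contact: user_contacts_mode }, Contact: { Address: :destroy } })
  u = s.add([:User, 1])
  c = s.add([:Contact, 1], u)
  s.add([:Address, 1], c)
  [s, u]
end

s1, u1 = build(:destroy)
s1.destroy(u1)
p s1.dangling  # [] : the Contact was destroyed, which also destroyed its Address

s2, u2 = build(:delete)
s2.destroy(u2)
p s2.dangling  # [[:Address, 1]] : the Contact was deleted without cascading
```

With :delete on the User to Contact relation, the Contact's own :dependent option is never consulted, so its Address survives and refers to a Contact that no longer exists.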
Formalizing Rails Semantics
Formal data model: M = ⟨S, C, D⟩
• S: The sets and relations of the data model (data model schema)
– e.g. {Account, Address, Project, User} and the relations between them
• C: Constraints on the relations
– Cardinality constraints, transitive relations, conditional relations, polymorphic relations
• D: Dependency constraints express conditions on two consecutive instances of a relation such that deletion of an object from the first instance leads to the other instance
Formalizing Rails Semantics
• Data model instance: I = ⟨O, R⟩ where O = {o_1, o_2, ..., o_n} is a set of object classes and R = {r_1, r_2, ..., r_m} is a set of object relations, and for each r_i ∈ R there exist o_j, o_k ∈ O such that r_i ⊆ o_j × o_k
• I = ⟨O, R⟩ is an instance of the data model M = ⟨S, C, D⟩, denoted by I |= M, if and only if
  1. the sets in O and the relations in R follow the schema S, and
  2. R |= C
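The two conditions for I |= M can be checked mechanically on a concrete instance. The following is a minimal sketch in plain Ruby under stated assumptions: the `SCHEMA` encoding, the multiplicity tags, and all function names are hypothetical, and only the per-source cardinality bound is checked (the belongs_to side and the other constraint kinds in C are left out).

```ruby
# Hypothetical encoding of a schema S with cardinality constraints C.
# Each relation: name => [source class, target class, multiplicity per source].
SCHEMA = {
  classes: [:User, :Account, :Project],
  relations: {
    account:  [:User, :Account, :lone], # has_one: at most one Account per User
    projects: [:User, :Project, :set]   # has_many: any number of Projects per User
  }
}.freeze

# Condition 1: every set in O and every tuple in R matches the schema S.
def follows_schema?(schema, objects, relations)
  return false unless objects.keys.all? { |klass| schema[:classes].include?(klass) }

  relations.all? do |name, tuples|
    decl = schema[:relations][name]
    next false if decl.nil?

    src, dst, _mult = decl
    tuples.all? do |a, b|
      objects.fetch(src, []).include?(a) && objects.fetch(dst, []).include?(b)
    end
  end
end

# Condition 2: R |= C, restricted here to the per-source multiplicity bound.
def satisfies_constraints?(schema, relations)
  schema[:relations].all? do |name, decl|
    next true unless decl[2] == :lone

    relations.fetch(name, []).group_by(&:first).values.all? { |tuples| tuples.size <= 1 }
  end
end

def model_instance?(schema, objects, relations)
  follows_schema?(schema, objects, relations) && satisfies_constraints?(schema, relations)
end

objects = { User: [:u1, :u2], Account: [:a1, :a2], Project: [:p1] }
good = { account: [[:u1, :a1]], projects: [[:u1, :p1]] }
bad  = { account: [[:u1, :a1], [:u1, :a2]] } # u1 has two accounts

puts model_instance?(SCHEMA, objects, good) # true
puts model_instance?(SCHEMA, objects, bad)  # false
```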
Formalizing Rails Semantics
• Given a pair of data model instances I = ⟨O, R⟩ and I' = ⟨O', R'⟩, (I, I') is a behavior of the data model M = ⟨S, C, D⟩, denoted by (I, I') |= M, if and only if
  1. O and R and O' and R' follow the schema S,
  2. R |= C and R' |= C, and
  3. (R, R') |= D
Data Model Properties
Given a data model M = ⟨S, C, D⟩, we define four types of properties:
1. state assertions (A_S): properties that we expect to hold for each instance of the data model
2. behavior assertions (A_B): properties that we expect to hold for each pair of instances that form a behavior of the data model
3. state predicates (P_S): predicates we expect to hold in some instance of the data model
4. behavior predicates (P_B): predicates we expect to hold in some pair of instances that form a behavior of the data model
Data Model Verification
• The data model verification problem: Given a data model property, determine if the data model satisfies the property.
• An enumerative (i.e., explicit state) search technique is not likely to be efficient for bounded verification
• We can use SAT-based bounded verification!
– Main idea: translate the verification query to a Boolean SAT instance and then use a SAT solver to search the state space
Data Model Verification
• SAT-based bounded verification: This is exactly what the Alloy Analyzer does!
• Alloy language allows specification of objects and relations, and the specification of constraints on relations using first order logic • In order to do bounded verification of Rails data models, automatically translate the Active Record specifications to Alloy specifications
Translation to Alloy
RAILS: class ObjectA has_one :objectB end
ALLOY: sig ObjectA { objectB: lone ObjectB }

RAILS: class ObjectA has_many :objectBs end
ALLOY: sig ObjectA { objectBs: set ObjectB }

RAILS: class ObjectA belongs_to :objectB end
ALLOY: sig ObjectA { objectB: one ObjectB }

RAILS: class ObjectA has_and_belongs_to_many :objectBs end
ALLOY: sig ObjectA { objectBs: set ObjectB }
       fact { ObjectA <: objectBs = ~(ObjectB <: objectA) }
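The Rails-to-Alloy mapping above is regular enough to write down as a lookup table. The sketch below is a hypothetical helper in plain Ruby, not the authors' translator: `ALLOY_MULTIPLICITY` and `alloy_sig` are illustrative names, and the inverse-relation `fact` needed for has_and_belongs_to_many is not generated.

```ruby
# Hypothetical helper: maps each association kind to the Alloy multiplicity
# keyword used in the mapping above.
ALLOY_MULTIPLICITY = {
  has_one: "lone",
  has_many: "set",
  belongs_to: "one",
  has_and_belongs_to_many: "set"
}.freeze

# associations: list of [kind, field name, target class] triples.
def alloy_sig(class_name, associations)
  return "sig #{class_name} {}" if associations.empty?

  fields = associations.map do |kind, field, target|
    "#{field}: #{ALLOY_MULTIPLICITY.fetch(kind)} #{target}"
  end
  "sig #{class_name} { #{fields.join(', ')} }"
end

puts alloy_sig("ObjectA", [[:has_one, "objectB", "ObjectB"]])
# sig ObjectA { objectB: lone ObjectB }
puts alloy_sig("Book", [[:has_many, "editions", "Edition"], [:belongs_to, "author", "Author"]])
# sig Book { editions: set Edition, author: one Author }
```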
Translating the :through Option
class Book < ActiveRecord::Base
  has_many :editions
  belongs_to :author
end
class Author < ActiveRecord::Base
  has_many :books
  has_many :editions, :through => :books
end
class Edition < ActiveRecord::Base
  belongs_to :book
end
[Diagram: Author (1) to Book (*), Book (1) to Edition (*), Author (1) to Edition (*)]

sig Book {
  editions: set Edition,
  author: one Author
}
sig Author {
  books: set Book,
  editions: set Edition
} {
  editions = books.editions
}
sig Edition {
  book: one Book
}
fact {
  Book <: editions = ~(Edition <: book)
  Book <: author = ~(Author <: books)
}
Translating the :dependent Option
• The :dependent option specifies what behavior to take on deletion of an object with regards to its associated objects
• To incorporate this dynamism, the model must allow analysis of how sets of objects and their relations change from one state to the next

class User < ActiveRecord::Base
  has_one :account
end
class Account < ActiveRecord::Base
  belongs_to :user, :dependent => :destroy
end

sig User {}
sig Account {}
one sig PreState {
  accounts: set Account,
  users: set User,
  relation1: Account lone -> one User
}
one sig PostState {
  accounts': set Account,
  users': set User,
  relation1': Account set -> set User
}
Translating the :dependent Option
pred deleteAccount [s: PreState, s': PostState, x: Account] {
  all x0: Account | x0 in s.accounts
  all x1: User | x1 in s.users
  s'.accounts' = s.accounts - x
  s'.users' = s.users
  s'.relation1' = s.relation1 - (x <: s.relation1)
}
– We also update the relations of its associated object(s) based on the use of the :dependent option
Translating the :dependent Option
pred deleteContext [s: PreState, s': PostState, x: Context] {
  all x0: Context | x0 in s.contexts
  all x1: Note | x1 in s.notes
  all x2: Preference | x2 in s.preferences
  all x3: Project | x3 in s.projects
  all x4: RecurringTodo | x4 in s.recurringtodos
  all x5: Tag | x5 in s.tags
  all x7: Todo | x7 in s.todos
  all x8: User | x8 in s.users
  s'.contexts' = s.contexts - x
  s'.notes' = s.notes
  s'.preferences' = s.preferences
  s'.projects' = s.projects
  s'.recurringtodos' = s.recurringtodos
  s'.tags' = s.tags
  s'.todos' = s.todos - x.(s.context_todos)
  s'.users' = s.users
  s'.notes_user' = s.notes_user
  s'.completed_todos_user' = s.completed_todos_user
  s'.recurring_todos_user' = s.recurring_todos_user
  s'.todos_user' = s.todos_user - (x.(s.context_todos) <: s.todos_user)
  s'.active_contexts_user' = s.active_contexts_user
  s'.active_projects_user' = s.active_projects_user
  s'.projects_user' = s.projects_user
  s'.contexts_user' = s.contexts_user - (x <: s.contexts_user)
  s'.recurring_todo_todos' = s.recurring_todo_todos - (s.recurring_todo_todos :> x.(s.context_todos))
  ...
Verification Overview
[Diagram: Active Records and Data Model Properties → Translator → Alloy Specification → Alloy Analyzer → Verified, or Counterexample Data Model Instance]
Experiments
• We used two open-source Rails applications in our experiments:
– TRACKS: An application to manage things-to-do lists
– Fat Free CRM: Customer Relations Management software

                     TRACKS        Fat Free CRM
LOC                  6062 lines    12069 lines
Data model classes   13 classes    20 classes
Alloy spec LOC       301 lines     1082 lines

• We wrote 10 properties for TRACKS and 20 properties for Fat Free CRM
Types of Properties Checked
• Relationship Cardinality
– Is an Opportunity always assigned to some Campaign?
• Transitive Relations
– Is a Note's User the same as the Note's Project's User?
  [Diagram: User, Note, Project]
• Deletion Does Not Cause Dangling References
– Are there any dangling Todos after a User is deleted?
• Deletion Propagates to Associated Objects
– Does the User related to a Lead still exist after the Lead has been deleted?
Experimental Results
• Of the 30 properties we checked, 7 failed
• For example, in TRACKS a Note's User can be different from the Note's Project's User
– Currently being enforced by the controller
– Since this could have been enforced using the :through option, we consider this a data-modeling error
• Another example from TRACKS: User deletion creates dangling Todos
  [Diagram: User (1) to Context (*), Context (1) to Todo (*); annotated with :dependent => :delete]
– User deletion does not get propagated into the relations of the Context object, including the Todos
Performance
• To measure performance, we recorded – the amount of time it took for Alloy to run and check the properties – the number of variables generated in the boolean formula generated for the SAT-solver • The time and number of variables are averaged over the properties for each application • Taken over an increasing bound, from at most 10 objects for each class to at most 35 objects for each class
Summary
• An approach to automatically discover data model errors in Ruby on Rails web applications
• Automatically extract a formal data model, verify using the Alloy Analyzer
• An automatic translator from Rails ActiveRecords to Alloy
– Handles three basic relationships and several options (:through, :conditions, :polymorphic, :dependent)
• Found several data model errors on two open source applications
• Bounded verification of data models is feasible!
What About Unbounded Verification?
• Bounded verification does not guarantee correctness for arbitrarily large data model instances • Is it possible to do unbounded verification of data models?
An Approach for Unbounded Verification
[Diagram: Web Application (Ruby on Rails) → Data Model (ActiveRecords), isolated by the MVC design pattern → Automatic Extraction → Automatic Translation + Automatic Projection + Properties → Formal Model (Sets and Relations) → Unbounded Verification (SMT Solver)]
Another Rails Data Model Example
class User < ActiveRecord::Base
  has_and_belongs_to_many :roles
  has_one :profile, :dependent => :destroy
  has_many :photos, :through => :profile
end
class Role < ActiveRecord::Base
  has_and_belongs_to_many :users
end
class Profile < ActiveRecord::Base
  belongs_to :user
  has_many :photos, :dependent => :destroy
  has_many :videos, :dependent => :destroy, :conditions => "format='mp4'"
end
class Tag < ActiveRecord::Base
  belongs_to :taggable, :polymorphic => true
end
class Video < ActiveRecord::Base
  belongs_to :profile
  has_many :tags, :as => :taggable
end
class Photo < ActiveRecord::Base ...
[Class diagram: Role, User, Profile, Photo, Video (format='mp4'), Tag, Taggable]
Translation to SMT-LIB
• Given a data model M = ⟨S, C, D⟩, we translate the constraints C and D to formulas in the theory of uninterpreted functions
• We use the SMT-LIB format
• We need quantification for some constraints
Translation to SMT-LIB
• One-to-Many Relation
RAILS:
class Profile
  has_many :videos
end
class Video
  belongs_to :profile
end
SMT-LIB:
(declare-sort Profile 0)
(declare-sort Video 0)
(declare-fun my_relation (Video) Profile)
Translation to SMT-LIB
• One-to-One Relation
RAILS:
class User
  has_one :profile
end
class Profile
  belongs_to :user
end
SMT-LIB:
(declare-sort User 0)
(declare-sort Profile 0)
(declare-fun my_relation (Profile) User)
(assert (forall ((x1 Profile) (x2 Profile))
  (=> (not (= x1 x2))
      (not (= (my_relation x1) (my_relation x2))))))
Translation to SMT-LIB
• Many-to-Many Relation
RAILS:
class User
  has_and_belongs_to_many :roles
end
class Role
  has_and_belongs_to_many :users
end
SMT-LIB:
(declare-sort Role 0)
(declare-sort User 0)
(declare-fun my_relation (Role User) Bool)
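The three basic encodings differ only in the shape of the declared function and in whether an injectivity axiom is added. The following plain-Ruby sketch emits the corresponding SMT-LIB text; `smt_sort` and `smt_relation` are hypothetical helper names and this is not the authors' translator.

```ruby
# Hypothetical helpers that emit SMT-LIB text for the three basic relations.
def smt_sort(name)
  "(declare-sort #{name} 0)"
end

# owner: the has_one / has_many side; owned: the belongs_to side.
def smt_relation(kind, name, owner, owned)
  case kind
  when :one_to_many
    # Each owned object maps to exactly one owner.
    ["(declare-fun #{name} (#{owned}) #{owner})"]
  when :one_to_one
    # Same function, plus injectivity: distinct owned objects have distinct owners.
    [
      "(declare-fun #{name} (#{owned}) #{owner})",
      "(assert (forall ((x1 #{owned}) (x2 #{owned})) " \
        "(=> (not (= x1 x2)) (not (= (#{name} x1) (#{name} x2))))))"
    ]
  when :many_to_many
    # A binary predicate instead of a function.
    ["(declare-fun #{name} (#{owner} #{owned}) Bool)"]
  else
    raise ArgumentError, "unknown relation kind: #{kind}"
  end
end

lines = [smt_sort("User"), smt_sort("Profile")] +
        smt_relation(:one_to_one, "user_profile", "User", "Profile")
puts lines
```

A function from the belongs_to side to the owner captures "exactly one owner" without any axiom, which is why only the one-to-one case needs a quantified assertion.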
Translating the :through Option
class Profile < ActiveRecord::Base
  belongs_to :user
  has_many :photos
end
class Photo < ActiveRecord::Base
  belongs_to :profile
end
class User < ActiveRecord::Base
  has_one :profile
  has_many :photos, :through => :profile
end
[Diagram: User (1) to Profile (0..1), Profile (1) to Photo (*), User (1) to Photo (*)]

(declare-sort Profile 0)
(declare-sort Photo 0)
(declare-sort User 0)
(declare-fun profile_photo (Photo) Profile)
(declare-fun user_profile (Profile) User)
(declare-fun user_photo (Photo) User)
(assert (forall ((u User) (ph Photo))
  (iff (= u (user_photo ph))
       (exists ((p Profile))
         (and (= u (user_profile p))
              (= p (profile_photo ph)))))))
Translating the :dependent Option
• The :dependent option specifies what behavior to take on deletion of an object with regards to its associated objects
• To incorporate this dynamism, the model must allow analysis of how sets of objects and their relations change from one state to the next

class User < ActiveRecord::Base
  has_one :profile, :dependent => :destroy
end
class Profile < ActiveRecord::Base
  belongs_to :user
end

(declare-sort Profile 0)
(declare-sort User 0)
(declare-fun Post_User (User) Bool)
(declare-fun Post_Profile (Profile) Bool)
(declare-fun user_profile (Profile) User)
(declare-fun Post_user_profile (Profile User) Bool)
Translating the :dependent Option
(assert (not (forall ((x User))
  (=> (and
        (forall ((a User))
          (ite (= a x) (not (Post_User a)) (Post_User a)))
        (forall ((b Profile))
          (ite (= x (user_profile b)) (not (Post_Profile b)) (Post_Profile b)))
        (forall ((a Profile) (b User))
          (ite (and (= b (user_profile a)) (Post_Profile a))
               (Post_user_profile a b)
               (not (Post_user_profile a b)))))
      ; Remaining property-specific constraints go here
  ))))
– Update the sets and relations of the associated object(s) based on the use of the :dependent option
Verification
• Once the data model is translated to SMT-LIB format, we can state properties about the data model, again in SMT-LIB, and then use an SMT solver to check if the property holds in the data model
• However, when we do that, for some large models the SMT solver times out!
• Can we improve the efficiency of the verification process?
Property-Based Data Model Projection
• Basic idea: Given a property to verify, reduce the size of the generated SMT-LIB specification by removing declarations and constraints that do not depend on the property
• Formally, given a data model M = ⟨S, C, D⟩ and a property p, (M, p) = M_P where M_P = ⟨S, C_P, D_P⟩ is the projected data model such that C_P ⊆ C and D_P ⊆ D
Property-Based Data Model Projection
• Key Property: For any property p, M |= p ⇔ (M, p) |= p
• Projection Input: Active Record files, property p
• Projection Output: The projected SMT-LIB specification
• Removes constraints on those classes and relations that are not explicitly mentioned in the property nor related to them based on transitive relations, dependency constraints or polymorphic relations
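One way to picture the projection is as a reachability computation over relations. The sketch below is an assumption-laden illustration in plain Ruby, not the algorithm from the slides: it assumes each relation records which other relations its definition depends on (for example, a :through relation depends on the two relations it composes), and the `Relation` struct and `project` function are hypothetical names.

```ruby
require "set"

# Hypothetical encoding: a relation, the classes it connects, and the
# relations its defining constraint refers to (assumed to be known).
Relation = Struct.new(:name, :classes, :depends_on)

# Keep the relations mentioned in the property plus everything they
# transitively depend on; drop the rest.
def project(relations, property_relations)
  by_name = relations.to_h { |r| [r.name, r] }
  keep = Set.new
  work = property_relations.dup
  until work.empty?
    name = work.pop
    next unless keep.add?(name) # skip relations already visited

    work.concat(by_name.fetch(name).depends_on)
  end
  relations.select { |r| keep.include?(r.name) }
end

RELATIONS = [
  Relation.new(:user_role,     [:User, :Role],     []),
  Relation.new(:user_profile,  [:User, :Profile],  []),
  Relation.new(:profile_photo, [:Profile, :Photo], []),
  Relation.new(:profile_video, [:Profile, :Video], []),
  Relation.new(:user_photo,    [:User, :Photo],    [:user_profile, :profile_photo]),
  Relation.new(:taggable_tag,  [:Taggable, :Tag],  [])
].freeze

# Property: a User's Photos are the same as the User's Profile's Photos.
projected = project(RELATIONS, [:user_photo])
p projected.map(&:name)                 # [:user_profile, :profile_photo, :user_photo]
p projected.flat_map(&:classes).uniq    # [:User, :Profile, :Photo]
```

Under these assumptions the projected model contains only User, Profile and Photo, so declarations and constraints for Role, Video, Tag and Taggable would not be emitted for this property.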
Data Model Projection: Example
• Data Model, M: [Class diagram: Role, User, Profile, Photo, Video, Tag, Taggable]
• Property, p: A User's Photos are the same as the User's Profile's Photos.
• (M, p) = [Class diagram: User, Profile, Photo]
Verification Overview
[Diagram: Active Records and Data Model Properties → Translator → Formal Data Model → Projection → SMT-LIB Specification → SMT Solver (Z3) → Verified, Counterexample Data Model Instance, or Unknown]
Experiments
• We used five open-source Rails apps in our experiments:
– LovdByLess: Social networking site
– Tracks: An application to manage things-to-do lists
– OpenSourceRails (OSR): Social project gallery application
– Fat Free CRM: Customer relations management software
– Substruct: An e-commerce application

                     LovdByLess   Tracks   OSR    Fat Free CRM   Substruct
LOC                  3787         6062     4295   12069          15639
Data Model Classes   13           13       15     20             17

• We wrote 10 properties for each application
Types of Properties Checked
• Relationship Cardinality
– Is an Opportunity always assigned to some Campaign?
• Transitive Relations
– Is a Note's User the same as the Note's Project's User?
  [Diagram: User, Note, Project]
• Deletion Does Not Cause Dangling References
– Are there any dangling Todos after a User is deleted?
• Deletion Propagates to Associated Objects
– Does the User related to a Lead still exist after the Lead has been deleted?
Experimental Results
• 50 properties checked, 16 failed, 11 were data model errors
• For example, in Tracks a Note's User can be different from the Note's Project's User
– Currently being enforced by the controller
– Since this could have been enforced using the :through option, we consider this a data-modeling error
• From OpenSourceRails: User deletion fails to propagate to associated Bookmarks
  [Diagram: User (1) to Bookmark (*)]
– Leaves orphaned bookmarks in database
– Could have been enforced in the data model by setting the :dependent option on the relation between User and Bookmark
Performance
• To measure performance, we recorded – The amount of time it took for Z3 to run and check the properties – The number of variables produced in the SMT specification • The time and number of variables are averaged over the properties for each application
Performance
• To compare with bounded verification, we repeated these experiments using the tool from our previous work and Alloy Analyzer – The amount of time it took for Alloy to run – The number of variables generated in the boolean formula generated for the SAT solver – Taken over an increasing bound, from at most 10 objects for each class to at most 35 objects for each class
Performance: Verification Time
[Figure: verification time versus scope (10 to 35), one panel each for Tracks, Substruct, OSR, FatFreeCRM and LovdByLess; series: Alloy, Z3, Z3+proj]
Performance: Formula Size (Variables)
[Figure: number of variables for LovdByLess, OSR, FatFreeCRM, Tracks and Substruct; Z3 panel compares non-proj with proj, Alloy panel plots variables versus scope (10 to 35)]
Unbounded vs Bounded Performance
• Why does unbounded verification out-perform bounded so drastically? Possible reasons: • SMT solvers operate at a higher level of abstraction than SAT solvers • Z3 uses many heuristics to eliminate quantifiers in formulas • Implementation languages are different – Z3 implemented in C++ – Alloy (as well as the SAT Solver it uses) is implemented in Java
Summary
• Automatically extract a formal data model, translate it to the theory of uninterpreted functions, and verify using an SMT solver
– Use property-based data model projection for efficiency
• An automatic translator from Rails ActiveRecords to SMT-LIB
– Handles three basic relationships and several options (:through, :conditions, :polymorphic, :dependent)
• Found multiple data model errors on five open source applications
• Unbounded verification of data models is feasible and more efficient than bounded verification!
Possible Extensions
• Analyzing dynamic behavior
– Model object creation in addition to object deletion
– Fuse the data model with the navigation model in order to analyze dynamic data model behavior
– Check temporal properties
• Automatic Property Inference
– Manual property writing is error prone
– Use the inherent graph structure in the data model to automatically infer properties about the data model
• Automatic Repair
– When the verifier concludes that a data model violates a property, automatically generate a repair that establishes the violated property