by
David A. Nader
HierarchiCal PrObabilistic Model for SoftwarE Traceability
Software Artifacts:
Type X Artifacts
Type Y Artifacts
Software Traceability
Software Traceability by Information Retrieval
Software Traceability by Probabilistic Reasoning
[Architecture diagram]
General Layer: Facade
Component Layer: Information Retrieval Facade, Causality Facade, Async Causality Facade (Futures, Multi-Process, Serial, Gevent)
Business Layer: Information Retrieval, Intervention, Association (Theano, Futures)
Test Suites (unittest, cProfile, pstats): IR Test Cases, Association Test Cases, IR Facade Test Cases, Causality Facade Test Cases, General Facade TC, Async non-functional TC
The bottlenecks are in the serial computation of the probabilistic models (Markov chain Monte Carlo and variational inference).
We want to reduce the running time of these computations.
import cProfile
import unittest
from pstats import Stats

# list_associationlink_generator and FutureCausalityAssociation come from the project under test.

class AsyncCausalityTestCase(unittest.TestCase):
    def setUp(self):
        """Build the link pool and start profiling before each test."""
        self.numLinks = 2  # two links
        self.listLinks = [list_associationlink_generator()
                          for _ in range(self.numLinks)]
        self.AssociationF = FutureCausalityAssociation(
            link_pool=self.listLinks,
            max_workers=self.numLinks,
            progressbar=False
        )
        self.pr = cProfile.Profile()
        self.pr.enable()
        print("\n<<<---")

    def tearDown(self):
        """Stop profiling and print the statistics after each test."""
        self.pr.disable()
        p = Stats(self.pr)
        p.strip_dirs()
        p.sort_stats('cumtime')
        p.print_stats()
        print("\n--->>>")
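The profiling pattern in setUp and tearDown can also be tried outside unittest. The following is a minimal sketch, not the project's code: `busy_work` and `profile_call` are hypothetical names, and the workload is a stand-in for an inference call. It captures the pstats report in a string instead of printing it directly.

```python
import cProfile
import io
import pstats


def busy_work(n):
    """Hypothetical stand-in for an expensive inference call."""
    return sum(i * i for i in range(n))


def profile_call(func, *args):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = func(*args)
    finally:
        profiler.disable()
    # Send the report to a buffer so the caller decides where it goes.
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.strip_dirs().sort_stats("cumtime").print_stats(5)
    return result, buffer.getvalue()


result, report = profile_call(busy_work, 10_000)
print(report)
```

Sorting by `cumtime` puts the calls that dominate total time at the top, which is the figure quoted in the results on later slides.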
Async non-functional TC (profiled with cProfile and pstats)
Executor Object
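from concurrent.futures import ThreadPoolExecutor

# Submit one call to a single-worker pool and wait for its result.
with ThreadPoolExecutor(max_workers=1) as executor:
    future = executor.submit(pow, 323, 1235)
    print(future.result())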
Executor Object
ThreadPoolExecutor
ProcessPoolExecutor
Reference: https://docs.python.org/3/library/concurrent.futures.html#concurrent.futures.Future
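Both executor classes expose the same two scheduling calls, `submit()` and `map()`. A small sketch of the difference, using a hypothetical `square` function as the task:

```python
from concurrent.futures import ThreadPoolExecutor


def square(x):
    """Hypothetical task; any callable works."""
    return x * x


with ThreadPoolExecutor(max_workers=2) as executor:
    # submit() schedules one call and returns a Future immediately.
    future = executor.submit(square, 7)
    single = future.result()  # blocks until the call has finished
    # map() schedules one call per input and yields results in input order.
    many = list(executor.map(square, [1, 2, 3]))

print(single, many)
```

Because the interface is shared, swapping `ThreadPoolExecutor` for `ProcessPoolExecutor` usually needs no change to the scheduling code, provided the task and its arguments can be pickled.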
Executor Object: ThreadPoolExecutor
from concurrent.futures import ThreadPoolExecutor
from time import sleep

def return_after_5_secs(message):
    sleep(5)
    return message

pool = ThreadPoolExecutor(3)
future = pool.submit(return_after_5_secs, "hello")
print(future.done())    # False: the task is still sleeping
sleep(6)                # wait slightly longer than the task takes
print(future.done())    # True
print(future.result())  # hello
pool.shutdown()
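Polling `done()` after a fixed sleep is only one way to observe a future. As a sketch of two alternatives from the same module, the example below registers a completion callback and blocks with `wait()` and a timeout. The helper `return_after` and the short delay are assumptions made to keep the example quick.

```python
from concurrent.futures import ThreadPoolExecutor, wait
from time import sleep


def return_after(delay, message):
    """Hypothetical variant of return_after_5_secs with a configurable delay."""
    sleep(delay)
    return message


completed = []

with ThreadPoolExecutor(max_workers=2) as pool:
    future = pool.submit(return_after, 0.2, "hello")
    # The callback runs once the future finishes, with the future as argument.
    future.add_done_callback(lambda f: completed.append(f.result()))
    # wait() blocks until the futures finish or the timeout expires.
    done, not_done = wait([future], timeout=5)

# Leaving the with block joins the worker threads, so the callback has run.
print(completed)
```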
Executor Object: ProcessPoolExecutor
from concurrent.futures import ProcessPoolExecutor
from time import sleep

def return_after_5_secs(message):
    sleep(5)
    return message

if __name__ == "__main__":  # guard needed on platforms that spawn worker processes
    pool = ProcessPoolExecutor(3)
    future = pool.submit(return_after_5_secs, "hello")
    print(future.done())
    sleep(6)  # wait slightly longer than the task takes
    print(future.done())
    print("Result: " + future.result())
    pool.shutdown()
Async Model
ThreadPoolExecutor
ProcessPoolExecutor
Probabilistic Inference Computations
Extendability and Operability
[Diagram: Async Causality Facade (Futures, Multi-Process, Serial, Gevent) next to the Causality Facade]
Baseline: SerialPoolAssociation
A list comprehension in Python
def SerialPoolAssociation(self):
    # Serial baseline: score every link in turn with a list comprehension
    links = [self.test_ltr_holistic(link) for link in self.__link_pool]
    print(links)
    return links
Test Cases
# Test the serial baseline
def test_SerialPoolAssociation(self):
    x = self.AssociationF.SerialPoolAssociation()
    self.assertEqual(len(x), self.numLinks)

# Test threading (concurrency)
def test_ThreadPoolAssociation(self):
    x = self.AssociationF.ThreadPoolAssociation()
    self.assertTrue(x)

# Test multiprocessing (parallelism)
def test_ProcessPoolAssociation(self):
    x = self.AssociationF.ProcessPoolAssociation()
    self.assertTrue(x)

# Test gevent
def test_GeventPoolAssociation(self):
    x = self.AssociationF.GeventPoolAssociation()
    self.assertTrue(x)
Results for 5 runs:
Min: 26554014 function calls (26157566 primitive calls) in 86.554 seconds
ThreadPoolAssociation
Futures Multi-Threading
def ThreadPoolAssociation(self):
    # The with statement ensures threads are cleaned up promptly
    with concurrent.futures.ThreadPoolExecutor(max_workers=self.__max_workers) as executor:
        # Start the computations and map each future to its link
        future_to_link = {executor.submit(self.test_ltr_holistic, link): link
                          for link in self.__link_pool}
        for future in concurrent.futures.as_completed(future_to_link):
            link = future_to_link[future]
            try:
                data = future.result()
            except Exception as exc:
                print('%r generated an exception: %s' % (link, exc))
            else:
                print('%r link is %f probable' % (link, data))
    return True
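The serial and thread-pool variants can be compared on a toy workload. This is a sketch under stated assumptions: `score_link` is a hypothetical stand-in for `test_ltr_holistic` that only sleeps, so the threads overlap almost perfectly. A CPU-bound inference step that holds the interpreter lock may gain far less.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def score_link(link):
    """Hypothetical stand-in for test_ltr_holistic: waits, then returns a fake score."""
    time.sleep(0.1)
    return len(link) / 10.0


def serial_scores(links):
    """Score the links one after another."""
    return {link: score_link(link) for link in links}


def threaded_scores(links, max_workers=4):
    """Score the links concurrently and collect results as they complete."""
    scores = {}
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        future_to_link = {executor.submit(score_link, link): link for link in links}
        for future in as_completed(future_to_link):
            scores[future_to_link[future]] = future.result()
    return scores


links = ["req-1", "req-22", "req-333", "req-4444"]

start = time.perf_counter()
serial = serial_scores(links)
serial_time = time.perf_counter() - start

start = time.perf_counter()
threaded = threaded_scores(links)
threaded_time = time.perf_counter() - start

print(f"serial: {serial_time:.2f}s, threaded: {threaded_time:.2f}s")
```

Returning the scores rather than `True` lets a test check that the concurrent variant produces the same values as the serial baseline.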
Results for 5 runs:
Min: 366 function calls in 53.179 seconds
ProcessPoolAssociation
Process-based futures (Python multiprocessing)
def ProcessPoolAssociation(self):
    with concurrent.futures.ProcessPoolExecutor() as executor:
        # map() yields results in input order; timeout is in seconds
        results = executor.map(self.test_ltr_holistic, self.__link_pool, timeout=300)
        for link, data in zip(self.__link_pool, results):
            print('%r is link: %f' % (link, data))
    return True
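The `timeout` argument of `map()` matters when an inference call hangs. The sketch below shows what happens when the deadline passes. It swaps in `ThreadPoolExecutor` to stay lightweight, on the assumption that the timeout behaves the same through the shared Executor interface; `slow_score` and the link names are hypothetical.

```python
import concurrent.futures
import time


def slow_score(link):
    """Hypothetical task that takes longer than the allowed timeout."""
    time.sleep(0.3)
    return 0.5


links = ["link-a", "link-b"]
timed_out = False

with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:
    try:
        # The deadline is measured from the call to map(), not per item.
        results = list(executor.map(slow_score, links, timeout=0.05))
    except concurrent.futures.TimeoutError:
        timed_out = True
        results = []

print(timed_out, results)
```

The timeout only stops the caller from waiting. Tasks that are already running are not interrupted, and leaving the `with` block still waits for them.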
Results for 5 runs:
Min: 1217 function calls in 87.191 seconds
Gevent
Gevent Pool
def GeventPoolAssociation(self):
    # Pool is presumably gevent.pool.Pool
    pool = Pool(self.__max_workers)
    for link in self.__link_pool:
        pool.spawn(self.test_ltr_holistic, link)
    # Wait for all greenlets to finish
    pool.join()
    return True
Results for 5 runs:
Min: 27810198 function calls (27391058 primitive calls) in 92.336 seconds
Processes | Threads |
---|---|
Processes don't share memory | Threads share memory |
Spawning/switching processes is expensive | Spawning/switching threads is less expensive |
Processes require more resources | Threads require fewer resources (are sometimes called lightweight processes) |
No memory synchronization needed | You need to use synchronization mechanisms to be sure you're correctly handling the data |
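The last row of the table can be illustrated with a shared counter. In this sketch, `SharedCounter` is a hypothetical class. Four threads update the same object, and the lock keeps each read-modify-write step atomic.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class SharedCounter:
    """Hypothetical counter shared by several threads."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def add_many(self, n):
        for _ in range(n):
            # Without the lock, concurrent increments could be lost.
            with self._lock:
                self.value += 1


counter = SharedCounter()

with ThreadPoolExecutor(max_workers=4) as pool:
    # All four tasks mutate the same object because threads share memory.
    list(pool.map(counter.add_many, [10_000] * 4))

print(counter.value)
```

With processes, each worker would receive its own copy of the counter, so the parent's value would stay unchanged unless results are sent back explicitly.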
Approach | Time [s] |
---|---|
Gevent | 106.62 ± 12.18 |
Multiprocessing | 92.67 ± 5.99 |
concurrent.futures | 78.88 ± 40.95 |
Serial (list comprehension) | 93.86 ± 5.92 |
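Figures of this shape can be gathered with a small timing harness. The sketch below assumes that the ± values are a mean and a sample standard deviation over the 5 runs, which the slides do not state. `workload` and `time_runs` are hypothetical names, and the workload is a stand-in for one full run of a pool method.

```python
import statistics
import time


def workload():
    """Hypothetical stand-in for one full run of a pool method."""
    return sum(i * i for i in range(50_000))


def time_runs(func, runs=5):
    """Return the wall-clock duration of each run, in seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    return samples


samples = time_runs(workload, runs=5)
mean = statistics.mean(samples)
spread = statistics.stdev(samples)  # sample standard deviation

print(f"{mean:.4f} ± {spread:.4f} s over {len(samples)} runs (min {min(samples):.4f} s)")
```

A spread as large as the one reported for concurrent.futures suggests that five runs may be too few to rank the approaches reliably.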
Reference: https://code.tutsplus.com/articles/introduction-to-parallel-and-concurrent-programming-in-python--cms-28612
Questions?
The actor model in computer science is a mathematical model of concurrent computation that treats "actors" as the universal primitives of concurrent computation.
Reference: https://en.wikipedia.org/wiki/Actor_model
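To make the definition concrete, here is a toy actor built only on the standard library. It is a sketch of the idea, not of any actor library's API: `MiniActor` and `Greeter` are hypothetical classes, each actor owns private state and a mailbox, and it handles one message at a time on its own thread.

```python
import queue
import threading


class MiniActor:
    """Hypothetical toy actor: private state, a mailbox, one message at a time."""

    _STOP = object()

    def __init__(self):
        self._mailbox = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def tell(self, message, reply_to=None):
        """Send a message without waiting for it to be handled."""
        self._mailbox.put((message, reply_to))

    def ask(self, message, timeout=1.0):
        """Send a message and wait for the reply."""
        reply_box = queue.Queue(maxsize=1)
        self.tell(message, reply_to=reply_box)
        return reply_box.get(timeout=timeout)

    def stop(self):
        self._mailbox.put((self._STOP, None))
        self._thread.join()

    def receive(self, message):
        raise NotImplementedError

    def _run(self):
        # Messages are taken from the mailbox strictly one after another.
        while True:
            message, reply_to = self._mailbox.get()
            if message is self._STOP:
                break
            reply = self.receive(message)
            if reply_to is not None:
                reply_to.put(reply)


class Greeter(MiniActor):
    def __init__(self):
        self.greeted = 0  # state touched only by the actor's own thread
        super().__init__()

    def receive(self, message):
        self.greeted += 1
        return f"Hello, {message}! (message #{self.greeted})"


greeter = Greeter()
first = greeter.ask("World")
second = greeter.ask("Actors")
greeter.stop()
print(first)
print(second)
```

Because only the actor's thread reads or writes `greeted`, no lock is needed. Callers interact with the actor purely through messages.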
Concurrent
All Actors run independently within the Actor System. The Actor System may run the Actors as threads, processes, or even sequential operations within the current process, all with no change to the Actors themselves.
Reference: https://thespianpy.com/doc/
Distributed
Actors run independently…anywhere. Multiple servers can each be running The Library and an Actor can be run on any of these systems—all with no change to the Actors themselves. The Library handles the communication between the Actors and the management process of distributing the Actors across the systems.
Reference: https://thespianpy.com/doc/
Location Independent
Because Actors run independently anywhere, they run independently of their actual location. A distributed Actor application may have part of it running on a local server, part running on a server in Amsterdam, and part running on a server in Singapore… or not, with no change or awareness of this by the Actors themselves.
Reference: https://thespianpy.com/doc/
Fault Tolerant
Individual Actors can fail and be restarted—automatically—without impact to the rest of the system.
Reference: https://thespianpy.com/doc/
Scalable
The number of Actors in the system can be dynamically extended based on factors such as work volume, and systems added to the Distributed Actor System environment are automatically utilized.
Reference: https://thespianpy.com/doc/
Reference: https://thespianpy.com/doc/
from thespian.actors import *

class Hello(Actor):
    def receiveMessage(self, message, sender):
        self.send(sender, 'Hello, World!')

if __name__ == "__main__":
    hello = ActorSystem().createActor(Hello)
    print(ActorSystem().ask(hello, 'hi', 1))
    ActorSystem().tell(hello, ActorExitRequest())
$ python helloActor.py
Hello, World!
$