Design Document Example
Problem Description
• Find all occurrences of a string within text files
• Start the search in all drives on the local machine
• Search all folders and subfolders
• Only search files whose names match a specific pattern
• Report the names of the files containing a match
• Report the total number of matching files
See the example code in the PPCP Unit 2 source code. Filename: ArchitecturalPatternsTests.cs Method: CompositionTest_FindFilesContainingText()
Pipeline
• Producer: Enumerate through each drive and add its path as a folder to search (into the folder queue).
• Worklist: Traverse folders. For each folder in the search queue:
  • Find files matching the given search pattern(s) and pass them to the next stage.
  • Enumerate subfolders and pass them back to the folder queue. This is where the worklist pattern is used.
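The producer and worklist steps above can be sketched as follows. This is a minimal illustration, not the code from ArchitecturalPatternsTests.cs: the names FolderWorklist, DriveRoots and Run are hypothetical, and the "pending" counter is one assumed way to detect when the folder queue has truly run dry, since workers keep feeding it.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class FolderWorklist
{
    // Producer input: the root folder of every drive that is ready to be read.
    public static IEnumerable<string> DriveRoots() =>
        DriveInfo.GetDrives().Where(d => d.IsReady).Select(d => d.RootDirectory.FullName);

    // Worklist: traverses every folder under 'roots', adds files matching 'pattern'
    // to 'files', and marks 'files' complete once the whole traversal has finished.
    public static void Run(IEnumerable<string> roots, string pattern,
                           BlockingCollection<string> files, int workers = 4)
    {
        var folders = new BlockingCollection<string>();
        int pending = 0; // folders that are queued or currently being processed

        foreach (var root in roots)
        {
            Interlocked.Increment(ref pending);
            folders.Add(root);
        }
        if (pending == 0) folders.CompleteAdding();

        var tasks = new Task[workers];
        for (int i = 0; i < workers; i++)
        {
            tasks[i] = Task.Run(() =>
            {
                foreach (var folder in folders.GetConsumingEnumerable())
                {
                    try
                    {
                        foreach (var file in Directory.EnumerateFiles(folder, pattern))
                            files.Add(file);
                        foreach (var sub in Directory.EnumerateDirectories(folder))
                        {
                            // Count the subfolder before queueing it, so 'pending'
                            // cannot reach zero while work is still being added.
                            Interlocked.Increment(ref pending);
                            folders.Add(sub);
                        }
                    }
                    catch (UnauthorizedAccessException) { /* skip unreadable folders */ }
                    catch (IOException) { /* skip folders that vanish or fail */ }
                    finally
                    {
                        // The worker that finishes the last folder closes the queue.
                        if (Interlocked.Decrement(ref pending) == 0)
                            folders.CompleteAdding();
                    }
                }
            });
        }
        Task.WaitAll(tasks);
        files.CompleteAdding();
    }
}
```

A call such as FolderWorklist.Run(FolderWorklist.DriveRoots(), "*.txt", files) would start the search on every ready drive. If 'files' is a bounded collection, the next stage must already be consuming it, otherwise the traversal blocks.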
Pipeline (cont.)
• Stage: Search each text file for an occurrence of the string. For each matching file, pass the path on to the next stage via another blocking collection.
• Final Stage: Report each matching file to output. Increment the count of matching files.
• After all stages: Report the total number of files matched.
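A sketch of the search stage and final stage described above, with one worker per stage. The names SearchPipeline, FileContains, SearchStage, ReportStage and Run are hypothetical, and the wording of the total line is an assumption; the original example code may be organized differently.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

public static class SearchPipeline
{
    // True if any line of the file contains 'text'; unreadable files count as no match.
    public static bool FileContains(string path, string text)
    {
        try { return File.ReadLines(path).Any(line => line.Contains(text)); }
        catch (IOException) { return false; }
        catch (UnauthorizedAccessException) { return false; }
    }

    // Stage: consume candidate paths and pass matching ones to the next stage.
    public static void SearchStage(BlockingCollection<string> candidates,
                                   BlockingCollection<string> matches, string text)
    {
        try
        {
            foreach (var path in candidates.GetConsumingEnumerable())
                if (FileContains(path, text)) matches.Add(path);
        }
        finally
        {
            // Always signal the final stage, even on failure, so it cannot wait forever.
            matches.CompleteAdding();
        }
    }

    // Final stage: report each match, then the total once the input is exhausted.
    public static int ReportStage(BlockingCollection<string> matches, TextWriter output)
    {
        int count = 0;
        foreach (var path in matches.GetConsumingEnumerable())
        {
            output.WriteLine(path);
            count++;
        }
        output.WriteLine($"Total files matched: {count}");
        return count;
    }

    // Wires the two stages together; 'candidates' is filled by the previous stage.
    public static int Run(BlockingCollection<string> candidates, string text, TextWriter output)
    {
        var matches = new BlockingCollection<string>();
        var search = Task.Run(() => SearchStage(candidates, matches, text));
        int count = ReportStage(matches, output);
        search.Wait();
        return count;
    }
}
```

With a single reporting worker, a plain local counter is enough. Passing Console.Out as the output writer sends the results to the console.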
Concurrency & Parallelism
Since we’ve described our solution in terms of parallel pattern names, it’s easy to identify concurrency.
• Each stage (producer, worklist, search stage, final stage) can run concurrently with the others.
• Furthermore, we can create multiple workers per stage to process work in parallel.
• Since we’re using a pipeline approach, a simple Parallel.For/ForEach is probably not well suited, but we can try.
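Running several workers per stage raises one practical question: who marks the output queue as complete? A minimal sketch, assuming a hypothetical helper named ParallelStage.Run, is to let every worker drain the shared input and to complete the output only after all of them have finished.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelStage
{
    // Runs 'workers' tasks that all consume 'input' and may add results to 'output'.
    // The returned task finishes once every worker is done and 'output' is completed.
    public static async Task Run<TIn, TOut>(
        BlockingCollection<TIn> input,
        BlockingCollection<TOut> output,
        int workers,
        Action<TIn, BlockingCollection<TOut>> body)
    {
        var tasks = Enumerable.Range(0, workers)
            .Select(_ => Task.Run(() =>
            {
                // GetConsumingEnumerable hands each item to exactly one worker.
                foreach (var item in input.GetConsumingEnumerable())
                    body(item, output);
            }))
            .ToArray();

        try
        {
            await Task.WhenAll(tasks);
        }
        finally
        {
            // Complete the output once, after the last worker has stopped adding.
            output.CompleteAdding();
        }
    }
}
```

For the search stage, the body would add the path to the output only when the file contains the search string. Results arrive in no particular order, which is acceptable here because the problem only asks for the set of matching files and their count.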
Limitations & Shared Resources
• The producer should be a single worker, since it needs to control the starting folders and prevent duplicates.
• Each stage needs input and output queues (perhaps via blocking collections).
• Final Stage:
  • Needs to increment a global counter of how many files were found. With more than one worker, the increment must be atomic.
  • Needs to output to the Console one result at a time. Console.WriteLine is synchronized across threads, so each call writes its line whole.
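If the final stage is given more than one worker, the shared counter and the output both need protection. The sketch below is one assumed way to do that; MatchReporter is a hypothetical name. It uses Interlocked.Increment for the counter and TextWriter.Synchronized for the writer, so it also works with writers other than the console.

```csharp
using System;
using System.IO;
using System.Threading;

public sealed class MatchReporter
{
    private int _count;
    private readonly TextWriter _output;

    public MatchReporter(TextWriter output)
    {
        // Wrap the writer so that concurrent WriteLine calls do not interleave.
        _output = TextWriter.Synchronized(output);
    }

    // Number of matches reported so far.
    public int Count => Volatile.Read(ref _count);

    // Safe to call from several workers at once.
    public void Report(string path)
    {
        Interlocked.Increment(ref _count); // atomic, unlike _count++
        _output.WriteLine(path);
    }
}
```

A reporter created with new MatchReporter(Console.Out) can be shared by all final-stage workers, and Count can be read after they finish to print the total.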