An overview of Tekoa, a domain-specific language for defining Opus model variables. Topics: the variable concept in Opus and how Opus implements model variables, the problems with defining variables as Python classes, Tekoa syntax with examples (including aggregation through multiple geographies and a more complex example), implementation, status, and future work.
Tekoa: A Domain-Specific Language for Defining Opus Variables • The variable concept in Opus • Problems with defining Opus variables in Python • Tekoa examples • Syntax • Status and Plans for Further Work • User discussion & wish list
The Variable Concept in Opus • A model variable (or just variable) is an attribute of actors or geographies used in a model. • Variables are properties of datasets, e.g. a gridcell dataset or a parcel dataset • Examples: • Population density • Land cost • Travel time to city center • Two kinds: • Primary attribute • Derived attribute • Not the same as “variable” as used in programming languages
Implementing Variables • Opus implements a model variable as a subclass of the Python class Variable • Uses lazy evaluation • Methods • dependencies() • compute() • This has worked very well from the point of view of accessing and computing variables • However, defining a new variable (even a simple one) requires writing a new Python class, ideally including a unit test
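The lazy-evaluation idea can be illustrated with a minimal sketch. This is not the opus_core code: the Variable stand-in, PopulationDensity, and LazyDataset below are hypothetical names, and the sketch only assumes that a derived attribute is computed on first access, after its dependencies, and then stored.

```python
class Variable:
    """Hypothetical stand-in for an Opus-style variable base class."""

    def dependencies(self):
        # Names of the attributes this variable needs before it can be computed.
        return []

    def compute(self, data):
        raise NotImplementedError


class PopulationDensity(Variable):
    def dependencies(self):
        return ["population", "area"]

    def compute(self, data):
        return [p / a for p, a in zip(data["population"], data["area"])]


class LazyDataset:
    """Holds primary attributes and computes derived ones only on demand."""

    def __init__(self, primary, variables):
        self._values = dict(primary)   # primary attributes, already known
        self._variables = variables    # derived attributes: name -> Variable
        self.compute_count = 0         # how many times compute() has run

    def __getitem__(self, name):
        if name not in self._values:
            variable = self._variables[name]
            for dependency in variable.dependencies():
                self[dependency]       # make sure each dependency is available
            self._values[name] = variable.compute(self)
            self.compute_count += 1
        return self._values[name]


gridcells = LazyDataset(
    {"population": [100, 50], "area": [4, 5]},
    {"population_density": PopulationDensity()},
)
density = gridcells["population_density"]   # computed here, on first access
density_again = gridcells["population_density"]  # served from the stored values
```

Nothing is computed when the dataset is built; the second access reuses the stored result, so compute() runs once.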
Variables in Python vs. Tekoa
# definition of zone.average_income in Python
from opus_core.variables.variable import Variable
class average_income(Variable):
    def dependencies(self):
        return ["household.income", "zone.zone_id",
                "urbansim_parcel.household.zone_id"]
    def compute(self, dataset_pool):
        households = dataset_pool.get_dataset("household")
        return self.get_dataset().aggregate_dataset_over_ids(
            households, "mean", "income")
# *** code for unit tests omitted ***
______________________________________________
# Tekoa definition
average_income = zone.aggregate(household.income, function=mean)
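To make concrete what both definitions compute, here is a minimal NumPy sketch of a mean taken over the households in each zone. It does not use Opus: aggregate_mean and the sample arrays are hypothetical, and returning 0 for a zone with no households is an assumption of this sketch, not a statement about Opus behavior.

```python
import numpy as np


def aggregate_mean(values, member_group_ids, group_ids):
    """Mean of `values` per group; groups without members get 0 (assumption)."""
    values = np.asarray(values, dtype=float)
    member_group_ids = np.asarray(member_group_ids)
    result = np.zeros(len(group_ids))
    for i, group_id in enumerate(group_ids):
        members = member_group_ids == group_id
        if members.any():
            result[i] = values[members].mean()
    return result


household_income = [30000, 50000, 40000]
household_zone_id = [1, 1, 2]      # zone that each household lives in
zone_id = [1, 2, 3]                # zone 3 has no households

average_income = aggregate_mean(household_income, household_zone_id, zone_id)
```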
Tekoa - Aggregation through multiple geographies
# employment in the ‘large_area’ geography
employment=large_area.aggregate(urbansim_parcel.building.number_of_jobs, intermediates=[parcel, zone, faz])
Explanation: • number_of_jobs is an attribute of building. We then aggregate this up to the parcel level, then the zone level, then the faz level, and finally the large_area level, to find the employment in the large_area. • The ‘employment=’ part gives an alias for the expression, so that it displays nicely in the resulting indicator.
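The chain building, parcel, zone, faz, large_area can be sketched as repeated sums over parent indices. This is an illustration in plain NumPy, not the Opus implementation; aggregate_sum, the 0-based parent index arrays, and the sample data are all assumptions made for the example.

```python
import numpy as np


def aggregate_sum(values, parent_index, n_parents):
    """Sum child values into their parents; parent_index[i] is the parent of child i."""
    return np.bincount(parent_index, weights=values, minlength=n_parents)


jobs_per_building = np.array([10, 5, 20, 1, 7])

# Hypothetical geography: each array maps a child to its parent (0-based).
building_to_parcel = np.array([0, 0, 1, 2, 3])
parcel_to_zone = np.array([0, 1, 1, 2])
zone_to_faz = np.array([0, 0, 1])
faz_to_large_area = np.array([0, 1])

# Aggregate one level at a time, in the order given by the intermediates list.
jobs_per_parcel = aggregate_sum(jobs_per_building, building_to_parcel, 4)
jobs_per_zone = aggregate_sum(jobs_per_parcel, parcel_to_zone, 3)
jobs_per_faz = aggregate_sum(jobs_per_zone, zone_to_faz, 2)
employment = aggregate_sum(jobs_per_faz, faz_to_large_area, 2)
```

Each step only needs the mapping from one level to the next, which is why the expression lists the intermediate datasets explicitly.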
Tekoa - More Complex Example
# definition of parcel.is_pre_1940:
# is the average year built of the buildings on a parcel earlier than 1940?
is_pre_1940 = parcel.aggregate(building.year_built * numpy.ma.masked_where(urbansim_parcel.building.has_valid_year_built==0, 1), function=mean) < 1940
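The effect of the mask is that buildings without a valid year are left out of the mean. A minimal NumPy sketch of that idea follows. It is a variation on the expression above, not a copy of it: it masks the year array directly instead of multiplying by a masked constant, and masked_group_mean, the sample data, and the choice to report False for parcels with no valid buildings are assumptions of this sketch.

```python
import numpy as np


def masked_group_mean(values, invalid, group_index, n_groups):
    """Per-group mean that ignores entries flagged as invalid.

    Groups with no valid entries are masked in the result.
    """
    masked = np.ma.masked_where(invalid, values)
    means = np.zeros(n_groups)
    empty = np.ones(n_groups, dtype=bool)
    for g in range(n_groups):
        members = masked[group_index == g]
        if members.count() > 0:        # count() ignores masked entries
            means[g] = members.mean()
            empty[g] = False
    return np.ma.array(means, mask=empty)


year_built = np.array([1920, 0, 1950, 1935, 0])
has_valid_year_built = np.array([1, 0, 1, 1, 0])
building_to_parcel = np.array([0, 0, 1, 1, 1])   # parcel 2 has no buildings

mean_year = masked_group_mean(
    year_built, has_valid_year_built == 0, building_to_parcel, n_groups=3
)
# Parcels without any valid building are reported as False (assumption).
is_pre_1940 = (mean_year < 1940).filled(False)
```

Without the mask, the zero placeholder years would drag the mean down and mark almost every parcel as pre-1940.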
Syntax • Syntax is a subset of Python • An expression can be: • The name of a variable • A function or operator applied to other expressions • All of the numpy functions and operators are available, e.g. exp, sqrt, +, -, ==, < • numpy-style array and matrix operations: for example, 1.2*household.income scales all the elements of the array of incomes • Aggregation • Intermediates argument -- list of intermediate datasets • Function - can be sum, mean, median, min, max • Disaggregation also supported
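Two of these operations are easy to picture in plain NumPy. The sketch below is only an analogy, with hypothetical arrays: elementwise scaling corresponds to 1.2*household.income, and disaggregation is modeled as copying a zone-level value down to each household through an index array, which is an assumption about what disaggregation amounts to rather than the Opus code.

```python
import numpy as np

household_income = np.array([30000.0, 50000.0, 40000.0])

# Elementwise scaling: every income is multiplied by 1.2.
scaled_income = 1.2 * household_income

# Disaggregation sketch: each household receives the value of its zone.
zone_average_income = np.array([40000.0, 55000.0])   # one value per zone
household_to_zone = np.array([0, 0, 1])              # 0-based zone of each household
household_zone_average = zone_average_income[household_to_zone]
```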
Interaction Sets and Expressions • InteractionDataset is a subclass of Dataset, which stores its data as a 2-d array • For example, for household location choice we are interested in the interaction between household income and cost per residential unit • The expression ln(household.income) * zone.average_housing_cost returns an n × m array where n is the number of households and m is the number of zones
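The n × m shape can be reproduced with NumPy broadcasting. This is a sketch with made-up data, not the InteractionDataset implementation: a column of n household values multiplied by a row of m zone values yields one entry per (household, zone) pair.

```python
import numpy as np

household_income = np.array([20000.0, 50000.0, 90000.0])   # n = 3 households
zone_average_housing_cost = np.array([1000.0, 2500.0])     # m = 2 zones

# Shape (n, 1) times shape (1, m) broadcasts to shape (n, m).
interaction = (
    np.log(household_income)[:, np.newaxis]
    * zone_average_housing_cost[np.newaxis, :]
)
```

Entry [i, j] holds the value for household i considered against zone j.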
Implementation • When a new Tekoa expression is encountered, the system: • parses it (using the Python parser) • analyzes the expression for dependencies on other variables and special methods (e.g. aggregate, disaggregate) • compiles a new Python class that defines the variable, including a dependencies() and a compute() method • Recursively compiles a new variable when aggregating/disaggregating an expression • Consequence: efficiency of expressions is the same as for the old-style definitions • The system maintains a cache of expressions that have already been compiled, so that if the same expression is encountered again the previously-compiled class is just returned
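The parse, analyze, and cache steps can be sketched with Python's standard ast module. This is a simplified illustration, not the Tekoa compiler: find_dependencies, compile_expression, and the generated class are hypothetical, the sketch handles bare expressions only (no alias assignment), and it generates dependencies() but no compute() method.

```python
import ast


def dotted_name(node):
    """Return 'a.b.c' for a pure Name/Attribute chain, otherwise None."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None


class DependencyFinder(ast.NodeVisitor):
    """Collects dotted variable names, skipping the names of called functions."""

    def __init__(self):
        self.dependencies = set()

    def visit_Call(self, node):
        # A callee such as zone.aggregate or ln is a function, not a variable.
        if dotted_name(node.func) is None:
            self.visit(node.func)
        for argument in node.args:
            self.visit(argument)
        for keyword in node.keywords:
            self.visit(keyword.value)

    def visit_Attribute(self, node):
        name = dotted_name(node)
        if name is not None:
            self.dependencies.add(name)
        else:
            self.generic_visit(node)


def find_dependencies(expression):
    finder = DependencyFinder()
    finder.visit(ast.parse(expression, mode="eval"))
    return finder.dependencies


_compiled_classes = {}   # cache: expression text -> generated class


def compile_expression(expression):
    """Return a generated class for the expression, reusing cached classes."""
    if expression not in _compiled_classes:
        names = sorted(find_dependencies(expression))
        _compiled_classes[expression] = type(
            "GeneratedVariable",
            (object,),
            {
                "expression": expression,
                "dependencies": lambda self, names=names: list(names),
            },
        )
    return _compiled_classes[expression]


generated = compile_expression("ln(household.income) * zone.average_housing_cost")
same_class = compile_expression("ln(household.income) * zone.average_housing_cost")
```

The second call returns the class built by the first, which mirrors the caching behavior described above.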
More Examples and Documentation • For lots of examples, see the aliases.py for various datasets in the urbansim_parcel package, e.g. • urbansim_parcel/buildings/aliases.py • urbansim_parcel/job/aliases.py • … • The language is described in Section 6.4 of the Opus/Urbansim User Manual • Also see: Alan Borning, Hana Sevcikova, and Paul Waddell, “A Domain-Specific Language for Urban Simulation Variables”, to appear, International Conference on Digital Government Research, Montreal, Canada, May 2008.
Tekoa Status and Future Work • Benefits: • significantly reduced code size (factor of 7 for urbansim gridcell vs urbansim parcel) • increased modeler productivity • Additional features to implement: • Parameterized expressions. For example is_pre_1940 should really be is_pre(1940) • Better error detection and messages • Tutorial & advanced techniques • Replace old variable definitions in the code base for gridcell model system with expressions (big job) • Integration of expressions with GUI • User discussion & wish list?