Learn the core skills of programming graphical user interfaces (GUIs) for geographical information analysis, including web-based interfaces. Explore topics such as web page analysis, JavaScript, TkInter, and event-based programming.
Programming for Geographical Information Analysis: Core Skills Graphical Interfaces: GUIs and the Web
This lecture A brief introduction to Graphical User Interfaces. How the web works. Webpage analysis. JavaScript.
Graphical User Interfaces (GUIs) In general, Python isn't used much for user interfaces, but there's no reason not to use it for them. You can build up WIMP (Windows; Icons; Menus; Pointers) Graphical User Interfaces (GUIs). The default package to use is tkinter ("TkInter"). This is an interface to the scripting language Tcl ("Tickle", the Tool Command Language) and its GUI library Tk. There are also more native-looking packages like wxPython: https://www.wxpython.org/
Basic GUI
import tkinter
root = tkinter.Tk()  # Main window.
c = tkinter.Canvas(root, width=200, height=200)
c.pack()  # Layout.
c.create_rectangle(0, 0, 200, 200, fill="blue")
tkinter.mainloop()  # Wait for interactions.
Event Based Programming Asynchronous programming, where you wait for user interactions. In Python, based on callbacks: where you pass a function into another, with the expectation that at some point the function will be run. You register or bind a function with/to an object on the GUI. When an event occurs, the object calls the function.
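The register-then-call pattern described above can be sketched without any GUI at all. This is a minimal illustration, not tkinter code: the class FakeButton and its bind and click methods are hypothetical names invented here to stand in for a widget and a user action.

```python
class FakeButton:
    """A stand-in for a GUI widget (hypothetical; not part of tkinter)."""

    def __init__(self):
        self.callbacks = []

    def bind(self, function):
        # Register: store the function object itself; do not call it yet.
        self.callbacks.append(function)

    def click(self):
        # Event: the object calls every function registered with it.
        for function in self.callbacks:
            function()


log = []


def run():
    # The callback: runs only when the event occurs.
    log.append("model run")


button = FakeButton()
button.bind(run)   # Pass the function (run), not its result (run()).
button.click()     # Simulate the user clicking.
print(log)
```

The key detail is that run is passed without brackets, so the widget decides when it is called.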
Simple event
import tkinter

def run():
    pass

root = tkinter.Tk()
menu_bar = tkinter.Menu(root)
root.config(menu=menu_bar)
model_menu = tkinter.Menu(menu_bar)
menu_bar.add_cascade(label="Model", menu=model_menu)
model_menu.add_command(label="Run model", command=run)
tkinter.mainloop()
The user experience Many people design for geeks. Users learn by trying stuff - they rarely read manuals, so think carefully about what the default behavior of any function should be. We need to design for the general public, but make advanced functions available for those who want them. We should try to help the user by... Using familiar keys and menus (e.g. Ctrl + C for copy). Including help systems and tutorials.
Designing for users At every stage when designing the GUI, think "is it obvious what this does?" Make all screens as simple as possible. Turn off functionality until needed, e.g.:
model_menu.entryconfig("Run model", state="disabled")
# Until the user has chosen files, then:
model_menu.entryconfig("Run model", state="normal")
Hide complex functionality and the options to change defaults in 'Options' menus. Most of all, consult and test. There is a formal element of software development called 'usability testing' in which companies watch people trying to achieve tasks with their software.
The internet and web The internet: fundamental network over which different communication technologies can work (including email, file transfer, the web). Generally regarded as those elements using the TCP/IP protocol (more later). The web: a hypertext (linked text) system with embedded media based within/utilising the internet.
Python and the internet Mainly used for data retrieval and processing, but can be used for backend web work and internet communication. To understand the web-based elements we need to first understand the basics of the web and webpages.
The web The web has a client-server architecture. [Diagram: a client (web browser) sends a request to a server or 'host' running a server program, which returns files.]
Setting up servers The key element of a client-server system is the socket ("Berkeley socket"; "POSIX socket"). Sockets are connections between machines which you can connect streams to. You can then write and read data to/from them. Basic operation is for a client to contact a machine via an address and a port number. The server program sits on a machine and listens to the port, waiting for contact using a server socket. When contact occurs, it generates a standard socket connected at the other end to the client socket.
Client
import socket
socket_1 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
socket_1.connect(("localhost", 5555))  # Address tuple
socket_1.send(bytes("hello world", encoding="UTF-8"))
Here the address of the machine we're trying to connect to is "localhost"; this indicates the local machine we're on. "5555" is the "port" number. We'll come back to this shortly. We're sending the data as bytes representing a UTF-8 encoded string.
Server
import socket
serversocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
serversocket.bind(('localhost', 5555))
serversocket.listen()
(socket_2, address) = serversocket.accept()
b = socket_2.recv(30)
print(str(b))
As we've bound this program to localhost, we need to run it on the same machine as the client. For internet-enabled code we'd change this to the address of the machine (we'll come back to this shortly). The program is set to receive up to 30 bytes of data. It will print "b'hello world'". The "b" indicates a bytes object (raw binary data), as that's how the client sent it.
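To try the client and server together without opening two terminals, both ends can be run from one script by putting the server in a thread. The following is a sketch under a few assumptions that are not in the slides: it binds to port 0 so the operating system picks a free port (rather than 5555), uses the numeric address 127.0.0.1, and reads in a loop until the client closes the connection so that messages longer than 30 bytes would also arrive whole. The function name serve_once is made up for illustration.

```python
import socket
import threading


def serve_once(server_socket, received):
    """Accept one client and collect everything it sends until it closes."""
    connection, address = server_socket.accept()
    chunks = []
    with connection:
        while True:
            chunk = connection.recv(30)   # Up to 30 bytes at a time.
            if not chunk:                 # Empty bytes: the client has closed.
                break
            chunks.append(chunk)
    received.append(b"".join(chunks))


# Server side: port 0 asks the operating system for any free port
# (an assumption made here; the slides use port 5555).
server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server_socket.bind(("127.0.0.1", 0))
server_socket.listen()
port = server_socket.getsockname()[1]

received = []
server_thread = threading.Thread(target=serve_once, args=(server_socket, received))
server_thread.start()

# Client side: connect, send the bytes, and close.
client_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client_socket.connect(("127.0.0.1", port))
client_socket.sendall("hello world".encode("UTF-8"))
client_socket.close()

server_thread.join()
server_socket.close()

message = received[0].decode("UTF-8")   # Turn the bytes back into a string.
print(message)
```

Decoding the received bytes with the same encoding the client used gives back an ordinary string rather than the b'...' form.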
The web / internet A full internet or web application also has to: Open multiple sockets to different clients. One server can serve several clients at once. Send data of arbitrary lengths. Deal with potential security issues. The core technology is the same though.
Client – Server architecture Here we've sent relatively plain binary data representing text. However, it would be usual to have a more complicated format known as a protocol, which the server recognises and processes. E-mails are sent out to servers using the Simple Mail Transfer Protocol (SMTP). Webpages are sent out from servers using the HyperText Transfer Protocol (HTTP).
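To make "protocol" concrete, here is roughly what the text of a very small HTTP request looks like, and how the first line of a reply can be picked apart. It is only a sketch: the helper names build_get_request and parse_status_line are invented for illustration, the sample response is made up rather than taken from a real server, and nothing is sent over the network.

```python
def build_get_request(host, path):
    """Return the bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        "GET " + path + " HTTP/1.1",   # Method, path, protocol version.
        "Host: " + host,               # Which site on the server we want.
        "Connection: close",           # Ask the server to close afterwards.
        "",                            # A blank line ends the headers.
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(response):
    """Return the numeric status code from the first line of a response."""
    first_line = response.split(b"\r\n", 1)[0].decode("ascii")
    version, code, reason = first_line.split(" ", 2)
    return int(code)


request = build_get_request("www.w3.org", "/People/Berners-Lee/Overview.html")
print(request.decode("ascii"))

# A made-up response, just to show the format a server would reply with.
sample_response = b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<HTML></HTML>"
status = parse_status_line(sample_response)
print(status)
```

In practice a library builds and parses these messages for you; the point is that both sides agree on the layout of the text.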
Introduction to network communications Several protocols may be involved at once. Most computers use "TCP/IP" when communicating with network nodes and other computers on the internet. Internet Protocol (IP): carries data in small chunks called "packets", and addresses and routes them to the right machine. Transmission Control Protocol (TCP): makes delivery reliable by letting computers confirm receipt and resend lost packets, and adds packets back together in the right order. Protocols like HTTP then format the data carried by these low-level protocols, which is split into packets.
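The idea of splitting data into numbered packets and putting them back in order can be shown with a toy example. This is an illustration of the concept only, not how TCP/IP is actually implemented; the function names and the packet size of 8 bytes are arbitrary choices made here.

```python
import random


def split_into_packets(data, size):
    """Cut data into (sequence_number, chunk) pairs of at most `size` bytes."""
    return [(i // size, data[i:i + size]) for i in range(0, len(data), size)]


def reassemble(packets):
    """Sort packets by sequence number and join the chunks back together."""
    return b"".join(chunk for number, chunk in sorted(packets))


message = b"hello world, this is a longer message"
packets = split_into_packets(message, 8)

random.shuffle(packets)        # Packets may arrive in any order.
rebuilt = reassemble(packets)  # The sequence numbers restore the order.

print(len(packets), "packets")
print(rebuilt)
```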
Ports Ports are numerical handles which software can associate with; one piece of software per port. Which server program gets which messages will depend on the port they are sent to, which is usually related to the transmission protocol. The computer looks at the port number associated with the message and diverts it to the registered software. E-mails are sent out to servers using port 25. Webservers use port 80. We used port 5555 as the first 1024 ports (0 to 1023) are allocated to specific purposes and protocols.
IP addresses An important element of the system is the Internet Protocol (IP) addresses of the machines to receive the messages. IP addresses are numeric: 129.11.87.11 is the School webserver. However, many network machines hold a registry which contains the numbers and what they'd prefer to be called. This scheme is called the Domain Name System (DNS). www.geog.leeds.ac.uk is the domain name of 129.11.87.11. In order for code to use domain names, it must perform a DNS lookup, contacting the nearest DNS server. localhost is a special name that maps to 127.0.0.1, which always means the local machine you're on.
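A name-to-number lookup can be tried from Python with the standard socket module. The sketch below only looks up localhost, so it should not need a network connection; looking up a real domain name works the same way but depends on the machine being online and able to reach a DNS server.

```python
import socket

# Ask the operating system to resolve a name to an IPv4 address.
address = socket.gethostbyname("localhost")
print(address)
```

On most machines this prints a loopback address beginning with 127, as described above.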
Ports and Firewalls If you set up client/server software using sockets, always check what other programs, if any, use the port you are on. Some networks have “Firewalls”. These are security devices that sit on ports and stop some connecting. Check for them. In general, scanning to see which ports are open ("port scanning") is regarded as suspicious behaviour, so keep in close contact with your local IT team.
Understanding URLs The web is a client-server system based around port 80 and HTTP. When a server gets a request it is usually to send out a webpage from a directory on the server. The file is usually referenced using a Uniform Resource Locator (URL).
http://www.w3.org:80/People/Berners-Lee/Overview.html
A URL represents another file on the network. It comprises…
• A method for locating information, i.e. a transmission protocol, e.g. the HyperText Transfer Protocol (http).
• A host machine name, e.g. www.w3.org
• A path to a file on that server, e.g. /People/Berners-Lee/Overview.html
• A port to connect to that server on, e.g. http connects to port 80.
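The parts of a URL can be pulled out in Python with urlsplit from the standard library's urllib.parse module. A short sketch using the example URL from this slide:

```python
from urllib.parse import urlsplit

url = "http://www.w3.org:80/People/Berners-Lee/Overview.html"
parts = urlsplit(url)

print(parts.scheme)    # The protocol.
print(parts.hostname)  # The host machine name.
print(parts.port)      # The port, as an integer.
print(parts.path)      # The path to the file on the server.
```

If the URL leaves the port out, parts.port is None and the client falls back on the default for the protocol.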
The Web Web pages consist of text that will be displayed and tags that won't, which include formatting details or references to other files like images or JavaScript code. The tags are referred to as the HyperText Markup Language (HTML). Pages are saved as text files with the suffix .html (or sometimes .htm). Note that if the filename is missing from the URL, the default file servers will look to send is index.html. You can also look at webpages "locally", that is directly on your hard drive, though some elements may not work properly unless served, especially those involved in downloading data.
A basic webpage
<HTML>
<HEAD>
<TITLE>Title for top of browser</TITLE>
</HEAD>
<BODY>
<!--Stuff goes here; this is a comment-->
</BODY>
</HTML>
The HEAD contains information about the page, the BODY contains the actual information to make the page. Note tags are not case sensitive.
<BODY>
The break tag breaks a line<BR /> like that.
<P> The paragraph tags </P> leave a line.
This is <B>Bold</B>. This is <I>Italic</I>
<IMG src="tim.gif" alt="Photo: Pic of Tim" width="50" height="50">
<A href="index.html">Link text</A>
</BODY>
The text in the file will only be shown with the format set out in the tags. Any line breaks etc. won't show up on screen. Tags can have 'attributes', like the href attribute in the Anchor tag "A". Note that IMG, like BR, has no closing tag.
Tables A lot of data is held in tables:
<TABLE>
<TR><TH>y</TH><TH>x</TH><TH>z</TH></TR>
<TR><TD>2</TD><TD>5</TD><TD>3</TD></TR>
<TR><TD>4</TD><TD>3</TD><TD>0</TD></TR>
<TR><TD>3</TD><TD>1</TD><TD>5</TD></TR>
</TABLE>
Document Object Model (DOM) As tags are nested, HTML can be thought of as a tree structure called the Document Object Model (DOM). Each element is a child of some parent. Document has a root. We can regard each element containing text as containing some innerHTML.
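The tree structure can be made visible with the HTMLParser class from Python's standard library, which reports each opening and closing tag as it reads the page. The sketch below indents each tag by its depth in the tree. The class name TreePrinter and the sample page are made up for illustration, and the depth counting assumes every tag in the page is explicitly closed.

```python
from html.parser import HTMLParser


class TreePrinter(HTMLParser):
    """Record each opening tag, indented by how deeply it is nested."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.outline = []

    def handle_starttag(self, tag, attrs):
        # A new element is a child of whichever element is currently open.
        self.outline.append("  " * self.depth + tag)
        self.depth += 1

    def handle_endtag(self, tag):
        # Closing a tag takes us back up to the parent.
        self.depth -= 1


page = ("<HTML><HEAD><TITLE>Demo</TITLE></HEAD>"
        "<BODY><P>Hello <B>world</B></P></BODY></HTML>")

printer = TreePrinter()
printer.feed(page)
printer.close()
print("\n".join(printer.outline))
```

The parser reports tag names in lowercase whatever case the page uses.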
Classes and IDs Elements may be given classes (generic groupings) and IDs (names specific to themselves) as attributes.
<TABLE class="datatable" id="yxz">
<TR><TD class='y'>73</TD></TR>
</TABLE>
Cascading Style Sheets In general we try to separate out the look of websites from their content. The look is stored in something called a Cascading Style Sheet (CSS). These link elements with looks. They are linked to the HTML in the HEAD with the following tag: <link rel="stylesheet" href="http://www.geog.leeds.ac.uk/courses/computing/css/doublePage.css"> or if in same directory: <link rel="stylesheet" href="doublePage.css">
Example
/* All tables */
TABLE { border: 1px solid black; }
/* All tables of class datatable */
TABLE.datatable { margin: 10px; }
/* Tables of ID yxz */
TABLE#yxz { background-color: white; }
/* TD in Tables of ID yxz */
TABLE#yxz td { padding: 10px; }
Good web design Like any GUI, good web design concentrates on usability. There are a number of websites that can help you - these are listed on the links page for this course. See also the tutorial on the course pages.
Web accessibility If you are working for a public organisation, accessibility for the disabled has to be a major design driver. Generally you can make webpages accessible by not putting important information in images and sound.
Getting web pages First we need to get the webpage by issuing an HTTP request. The best option for this is the requests library that comes with Anaconda: http://docs.python-requests.org/en/master/
import requests
r = requests.get('https://etc', auth=('user', 'pass'))
The username and password are optional. To get the page:
content = r.text
Other variables and functions
r.status_code
HTTP status codes are returned by servers as well as any HTML and files: https://en.wikipedia.org/wiki/List_of_HTTP_status_codes
200 OK
204 No Content
400 Bad Request
401 Unauthorized
403 Forbidden
404 Not Found
408 Request Timeout
500 Internal Server Error
502 Bad Gateway (for servers passing on requests elsewhere)
504 Gateway Timeout (for servers passing on requests elsewhere)
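Python's standard library already knows these codes through http.HTTPStatus, which is handy when turning a number such as r.status_code into something readable. A small sketch with no network access needed; the helper is_success is a made-up name, and treating the 200 to 299 range as success is the usual convention rather than something stated in the slides.

```python
from http import HTTPStatus


def is_success(code):
    """Return True for codes in the 2xx ("it worked") range."""
    return 200 <= code < 300


codes = [200, 204, 400, 401, 403, 404, 408, 500, 502, 504]
for code in codes:
    status = HTTPStatus(code)   # Look up the standard meaning of the code.
    print(code, status.phrase, "- success" if is_success(code) else "- problem")
```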
JSON You can use requests to get JSON files from the web and translate them into a Python object: a mix of dicts and lists, similar to what the json library produces.
json_object = r.json()
http://docs.python-requests.org/en/master/user/quickstart/#json-response-content
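The kind of object you get back can be seen without a web request by parsing some JSON text with the standard json library. The JSON below is made up for illustration; a response parsed with r.json() gives the same sort of nested dicts and lists.

```python
import json

# Made-up JSON text standing in for the body of a web response.
text = '{"site": "yxz", "readings": [{"y": 2, "x": 5, "z": 3}, {"y": 4, "x": 3, "z": 0}]}'

data = json.loads(text)   # JSON objects become dicts, arrays become lists.

print(data["site"])
ys = [reading["y"] for reading in data["readings"]]
print(ys)
```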
Other options Ability to deal with cookies. Ability to pass parameters to servers in a variety of ways. Ability to maintain sessions with a server. Ability to issue custom headers representing different browsers ("user-agent"), etc. Ability to deal with streaming.
Processing webpages The best library for this is Beautiful Soup: https://www.crummy.com/software/BeautifulSoup/
import bs4
soup = bs4.BeautifulSoup(content, 'html.parser')
How to get elements
Getting elements by ID or other attributes:
table = soup.find(id="yxz")
tds = soup.find_all(attrs={"class" : "y"})
Getting all elements of a specific tag:
trs = table.find_all('tr')
for tr in trs:
    pass  # Do something with the "tr" variable.
Getting elements inside another and getting their innerHTML:
tds = tr.find_all("td")
for td in tds:
    print(td.text)
All tags are lowercased during search.
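Putting these calls together, the sketch below reads a small table into a list of rows of numbers. It assumes the bs4 package is installed. The HTML is an inline string adapted from the table shown earlier rather than a downloaded page, and putting class "y" on the first cell of each row is an assumption made here so that the attribute search has something to find.

```python
import bs4

# Sample page content; normally this would come from r.text.
content = """
<TABLE class="datatable" id="yxz">
<TR><TH>y</TH><TH>x</TH><TH>z</TH></TR>
<TR><TD class="y">2</TD><TD>5</TD><TD>3</TD></TR>
<TR><TD class="y">4</TD><TD>3</TD><TD>0</TD></TR>
<TR><TD class="y">3</TD><TD>1</TD><TD>5</TD></TR>
</TABLE>
"""

soup = bs4.BeautifulSoup(content, "html.parser")
table = soup.find(id="yxz")          # Find the table by its ID.

rows = []
for tr in table.find_all("tr"):      # Tag names are searched in lowercase.
    tds = tr.find_all("td")
    if tds:                          # Skip the header row, which has only TH cells.
        rows.append([int(td.text) for td in tds])

# Find cells by class, wherever they are in the page.
y_values = [int(td.text) for td in soup.find_all(attrs={"class": "y"})]

print(rows)
print(y_values)
```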
Client side coding Generally done in JavaScript. The control structures are broadly similar to Python's, but each statement ends in a semicolon and blocks are defined by {} rather than indentation.
function dragStart(ev) {}
if (a < b) { } else { }
for (a = 0; a < b; a++) {}
var a = 12;
var a = [1,2,3];
// Comment
/**
 * Comment
 **/
Getting elements in JavaScript
document is the root of the page.
var a = document.getElementById("yxz");
var a = document.getElementsByClassName("datatable");
var tds = document.getElementsByTagName("TD");
Getting text:
alert(tds[0].innerHTML); // popup box
console.log(tds[0].innerHTML); // Browser console (F12 to open with most)
Setting text:
tds[0].innerHTML = "2";
Connecting JavaScript JavaScript is largely run through Event Based Programming. Each HTML element has specific events associated with it. We attach a function to run to these thus: <SPAN id="clickme" onclick="functionToRun()">Push</SPAN> <BODY onload="functionToRun()">
Where to put JavaScript Functions placed between <script> </script> tags in either the head or body. In the body code will run in the order the page loads if not in functions. Alternatively, can be in an external script linked to with a filename or URL in the body or head, thus: <script src="script.js"></script>
Example
<HTML>
<HEAD>
<SCRIPT>
function clicked() {
    var a = document.getElementById("clickme");
    a.innerHTML = "changed";
}
</SCRIPT>
</HEAD>
<BODY>
<SPAN id="clickme" onclick="clicked()">Push</SPAN>
</BODY>
</HTML>