New: List of Software for UAVs and Aerial Imagery

My research team and I at the Humanitarian UAV Network (UAViators) have compiled a list of more than 30 common software platforms used to operate UAVs and analyze resulting aerial imagery. We carried out this research to provide humanitarian organizations with a single repository where they can review existing software platforms (including free & open source solutions) for their humanitarian UAV missions. The results, available here, provide a brief description of each software platform along with corresponding links for additional information and download. We do realize that this list is not (yet) comprehensive, so we hope you’ll help us fill remaining gaps. This explains why we’ve made our research available as an open, editable Google Doc.

UAV software

Many thanks to my research assistant Peter Mosur for taking the lead on this. We have additional research documents available here on the UAViators website.

bio

See also:

  • Humanitarian UAV Network: Strategy for 2014-2015 [link]
  • Humanitarians in the Sky: Using UAVs for Disaster Response [link]
  • Low-Cost UAV Applications for Post-Disaster Damage Assessments: A Streamlined Workflow [link]

Low-Cost UAV Applications for Post-Disaster Assessments: A Streamlined Workflow

Colleagues Matthew Cua, Charles Devaney and others recently co-authored this excellent study on their latest use of low-cost UAVs/drones for post-disaster assessments, environmental development and infrastructure development. They describe the “streamlined workflow—flight planning and data acquisition, post-processing, data delivery and collaborative sharing,” that they created “to deliver acquired images and orthorectified maps to various stakeholders within [their] consortium” of partners in the Philippines. They conclude from direct hands-on experience that “the combination of aerial surveys, ground observations and collaborative sharing with domain experts results in richer information content and a more effective decision support system.”

Screen Shot 2014-10-03 at 11.26.12 AM

UAVs have become “an effective tool for targeted remote sensing operations in areas that are inaccessible to conventional manned aerial platforms due to logistic and human constraints.” As such, “The rapid development of unmanned aerial vehicle (UAV) technology has enabled greater use of UAVs as remote sensing platforms to complement satellite and manned aerial remote sensing systems.” The figure above (click to enlarge) depicts the aerial imaging workflow developed by the co-authors to generate and disseminate post-processed images. This workflow, the main components of which are “Flight Planning & Data Acquisition,” “Data Post-Processing” and “Data Delivery,” will “continuously be updated, with the goal of automating more activities in order to increase processing speed, reduce cost and minimize human error.”

Screen Shot 2014-10-03 at 11.27.02 AM

Flight Planning simply means developing a flight plan based on clearly defined data needs. The screenshot above (click to enlarge) is a “UAV flight plan of the coastal section of Tacloban city, Leyte generated using APM Mission Planner. The [flight] plan involved flying a small UAV 200 meters above ground level. The raster scan pattern indicated by the yellow line was designed to take images with 80% overlap & 75% side overlap. The waypoints indicating a change in direction of the UAV are shown as green markers.” The purpose of the overlap is to allow the images to be stitched together and accurately geo-referenced during post-processing. A video on how to program a UAV flight is available here. This video specifically focuses on post-disaster assessments in the Philippines.
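
To make the overlap figures more concrete, here is a minimal sketch of how overlap translates into spacing between photos and between flight lines. The altitude and overlap percentages come from the flight plan quoted above; the two camera field-of-view angles are placeholders I made up for illustration, and the function names are my own rather than anything from APM Mission Planner.

```python
import math


def footprint_m(altitude_m: float, fov_deg: float) -> float:
    """Ground distance covered along one image axis by a camera pointing straight down."""
    return 2.0 * altitude_m * math.tan(math.radians(fov_deg) / 2.0)


def spacing_m(footprint: float, overlap: float) -> float:
    """Distance between neighbouring image centres for a given fractional overlap."""
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    return footprint * (1.0 - overlap)


ALTITUDE_M = 200.0       # from the quoted flight plan
FORWARD_OVERLAP = 0.80   # from the quoted flight plan
SIDE_OVERLAP = 0.75      # from the quoted flight plan
FOV_ALONG_DEG = 50.0     # assumption: not a figure from the study
FOV_ACROSS_DEG = 65.0    # assumption: not a figure from the study

along = footprint_m(ALTITUDE_M, FOV_ALONG_DEG)
across = footprint_m(ALTITUDE_M, FOV_ACROSS_DEG)

print(f"take a photo every {spacing_m(along, FORWARD_OVERLAP):.0f} m along track")
print(f"space flight lines {spacing_m(across, SIDE_OVERLAP):.0f} m apart")
```

Higher overlap means shorter spacing and therefore more images per square kilometer, which is the trade-off a flight planner has to balance against battery endurance.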

“Once in the field, the team verifies the flight plans before the UAV is flown by performing a pre-flight survey [which] may be done through ground observations of the area, use of local knowledge or short range aerial observations with a rotary UAV to identify launch/recovery sites and terrain characteristics. This may lead to adjustment in the flight plans. After the flight plans have been verified, the UAV is deployed for data acquisition.”

Screen Shot 2014-10-03 at 11.27.33 AM

Matthew, Charles and team initially used a Micropilot MP-Vision UAV for data acquisition. “However, due to increased cost of maintenance and significant skill requirements of setting up the MP-Vision,” they developed their own custom UAV instead, which “uses semi-professional and hobby-grade components combined with open-source software” as depicted in the above figure (click to enlarge). “The UAV’s airframe is the Super SkySurfer fixed-wing EPO foam frame.” The team used the “ArduPilot Mega (APM) autopilot system consisting of an Arduino-based microprocessor board, airspeed sensor, pressure and temperature sensor, GPS module, triple-axis gyro and other sensors. The firmware for navigation and control is open-source.”

The custom UAV, which costs approximately $2,000, has “an endurance of about 30-50 minutes, depending on payload weight and wind conditions, and is able to survey an area of up to 4 square kilometers.” The custom platform was “easier to assemble, repair, maintain, modify & use. This allowed faster deploy-ability of the UAV. In addition, since the autopilot firmware is open-source, with a large community of developers supporting it, it became easier to identify and address issues and obtain software updates.” That said, the custom UAV was “more prone to hardware and software errors, either due to assembly of parts, wiring of electronics or bugs in the software code.” Despite these drawbacks, “use of the custom UAV turned out to be more feasible and cost effective than use of a commercial-grade UAV.”

In terms of payloads (cameras), three different kinds were used: Panasonic Lumix LX3, Canon S100, and GoPro Hero 3. These cameras come with both advantages and disadvantages for aerial mapping. The LX3 has better image quality but the servo triggering the shutter would often fail. The S100 is GPS-enabled and does not require mechanical triggering. The Hero 3 was used specifically for video reconnaissance.

Screen Shot 2014-10-04 at 5.31.47 AM

“The workflow at [the Data-Processing] stage focuses on the creation of an orthomosaic—an orthorectified, georeferenced and stitched map derived from aerial images and GPS and IMU (inertial measurement unit values, particularly yaw, pitch and roll) information.” In other words, “orthorectification is the process of stretching the image to match the spatial accuracy of a map by considering location, elevation, and sensor information.”

Transforming aerial images into orthomosaics involves: (1) manually removing take-off/landing, blurry & oblique images; (2) applying contrast enhancement to images that are either over- or under-exposed using commercial image-editing software; (3) geo-referencing the resulting images; (4) creating an orthomosaic from the geo-tagged images. The geo-referencing step is not needed if the images are already geo-referenced (i.e., have GPS coordinates, like those taken with the Canon S100). “For non-georeferenced images, georeferencing is done by a custom Python script that generates a CSV file containing the mapping between images and GPS/IMU information. In this case, the images are not embedded with GPS coordinates.” The sample orthomosaic above uses 785 images taken during two UAV flights (click to enlarge).
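
The team's Python script is not reproduced in the study, so the following is only my own sketch of what such a mapping step could look like. It assumes that each image has a capture time and that the autopilot log holds time-stamped GPS/IMU records, and it pairs each image with the nearest record in time. The log layout, column names and sample values are all hypothetical.

```python
import csv
import io
from bisect import bisect_left

FIELDS = ["image", "lat", "lon", "alt", "yaw", "pitch", "roll"]


def nearest_index(times, t):
    """Index of the entry in the sorted list `times` that is closest to t."""
    i = bisect_left(times, t)
    candidates = [j for j in (i - 1, i) if 0 <= j < len(times)]
    return min(candidates, key=lambda j: abs(times[j] - t))


def build_rows(images, log):
    """Pair every (filename, capture_time) with the closest log record."""
    log = sorted(log, key=lambda rec: rec["time"])
    times = [rec["time"] for rec in log]
    rows = []
    for name, t in images:
        rec = log[nearest_index(times, t)]
        rows.append({"image": name, **{f: rec[f] for f in FIELDS[1:]}})
    return rows


def write_csv(rows, handle):
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical autopilot log (time in seconds since take-off) and image list.
log = [
    {"time": 0.0, "lat": 11.2400, "lon": 125.0000, "alt": 200.0, "yaw": 90.0, "pitch": 1.0, "roll": 0.5},
    {"time": 10.0, "lat": 11.2400, "lon": 125.0010, "alt": 201.0, "yaw": 90.5, "pitch": 0.8, "roll": -0.2},
    {"time": 20.0, "lat": 11.2400, "lon": 125.0020, "alt": 199.5, "yaw": 89.7, "pitch": 1.2, "roll": 0.1},
]
images = [("IMG_0001.JPG", 4.0), ("IMG_0002.JPG", 11.5), ("IMG_0003.JPG", 25.0)]

rows = build_rows(images, log)
buffer = io.StringIO()
write_csv(rows, buffer)
print(buffer.getvalue())
```

A real script would also have to deal with clock offsets between the camera and the autopilot, which this sketch ignores.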

Matthew, Charles and team used the “Pix4Dmapper photomapping software developed by Pix4D” to render their orthomosaics. “The program can use either geotagged or non-geotagged images. For non-geotagged images, the software accepts other inputs such as the CSV file generated by the custom Python script to georeference each image and generate the photomosaic. Pix4D also outputs a report containing information about the output, such as total area covered and ground resolution. Quantum GIS, an open-source GIS software, was used for annotating and viewing the photomosaics, which can sometimes be too large to be viewed using common photo viewing software.”

Screen Shot 2014-10-03 at 11.28.20 AM

Data Delivery involves uploading the orthomosaics to a common, web-based platform that stakeholders can access. Orthomosaics “generally have large file sizes (e.g. around 300MB for a 2 sq. km. render),” so the team created a web-based geographic information system (GIS) to facilitate sharing of aerial maps. “The platform, named VEDA, allows viewing of rendered maps and adding metadata. The key advantage of using this platform is that the aerial imagery data is located in one place & can be accessed from any computer with a modern Internet browser. Before orthomosaics can be uploaded to the VEDA platform, they need to be converted into an appropriate format supported by the platform. The current format used is MBTiles developed by Mapbox. The MBTiles format specifies how to partition a map image into smaller image tiles for web access. Once uploaded, the orthomosaic map can then be annotated with additional information, such as markers for points of interest.” The screenshot above (click to enlarge) shows the layout of a rendered orthomosaic in VEDA.
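
For readers curious about what “partitioning a map image into smaller image tiles” means in practice, here is a short sketch of the standard Web Mercator tile arithmetic. It finds the tile that contains a given point at a given zoom level and then converts the row number to the TMS convention that MBTiles uses for its tile_row column. The sample coordinates are only a rough location for Tacloban and the zoom level is arbitrary; none of this is taken from the VEDA platform itself.

```python
import math


def lonlat_to_xyz(lon: float, lat: float, zoom: int) -> tuple[int, int]:
    """Column and row of the Web Mercator (XYZ) tile containing a lon/lat point."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Clamp so that points on the eastern or southern edge stay in range.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def xyz_to_tms_row(y: int, zoom: int) -> int:
    """MBTiles stores rows in the TMS scheme, which counts from the south."""
    return (2 ** zoom) - 1 - y


ZOOM = 16                          # arbitrary zoom level for illustration
LON, LAT = 125.00, 11.24           # approximate, for illustration only

col, row = lonlat_to_xyz(LON, LAT, ZOOM)
print("zoom_level:", ZOOM)
print("tile_column:", col)
print("tile_row (TMS):", xyz_to_tms_row(row, ZOOM))
```

Each additional zoom level quadruples the number of tiles, which is why a browser only ever needs to download the handful of tiles currently in view rather than the full 300MB render.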

Matthew, Charles and team have applied the above workflow in various mission-critical UAV projects in the Philippines, including damage assessment work after Typhoon Haiyan in 2013. This also included assessing the impact of the typhoon on agriculture, which was an ongoing concern for local government during the recovery efforts. “The coconut industry, in particular, which plays a vital role in the Philippine economy, was severely impacted due to millions of coconut trees being damaged or flattened after the storm hit. In order to get an accurate assessment of the damage wrought by the typhoon, and to make a decision on the scale of recovery assistance from national government, aerial imagery coupled with a ground survey is a potentially promising approach.”

So the team received permission from local government to fly “several missions over areas in Eastern Visayas that [were] devoted to coconut stands prior to Typhoon Haiyan.” (As such, “The UAV field team operated mostly in rural areas and wilderness, which reduced the human risk factor in case of aircraft failure. Also, as a safety guideline, the UAV was not flown within 3 miles from an active airport”). The partners in the Philippines are developing image processing techniques to distinguish “coconut trees from wild forest and vegetation for land use assessment and carbon source and sink estimates. One technique involved use of superpixel classification, wherein the image pixels are divided into homogeneous regions (i.e. collection of similar pixels) called superpixels which serve as the basic unit for classification.”
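
The idea of classifying regions rather than individual pixels can be illustrated with a deliberately simplified toy example. In the sketch below, fixed square blocks stand in for superpixels (real superpixel algorithms such as SLIC adapt the region boundaries to the image content), and a simple “excess green” score with an arbitrary threshold stands in for a trained classifier. None of this is the team's actual method; it only shows the region-as-unit principle.

```python
def block_regions(height, width, size):
    """Yield one list of (row, col) pixel coordinates per square block."""
    for top in range(0, height, size):
        for left in range(0, width, size):
            yield [(r, c)
                   for r in range(top, min(top + size, height))
                   for c in range(left, min(left + size, width))]


def mean_excess_green(image, region):
    """Average of 2*G - R - B over a region; high values suggest vegetation."""
    total = 0.0
    for r, c in region:
        red, green, blue = image[r][c]
        total += 2 * green - red - blue
    return total / len(region)


def classify_regions(image, size=2, threshold=40.0):
    """Label every pixel with the class assigned to the block it belongs to."""
    height, width = len(image), len(image[0])
    labels = [[None] * width for _ in range(height)]
    for region in block_regions(height, width, size):
        score = mean_excess_green(image, region)
        label = "vegetation" if score > threshold else "other"
        for r, c in region:
            labels[r][c] = label
    return labels


# Tiny made-up image: a green patch on the left, a grey patch on the right.
GREEN, GREY = (30, 150, 40), (120, 120, 120)
image = [
    [GREEN, GREEN, GREY, GREY],
    [GREEN, GREEN, GREY, GREY],
]

for row in classify_regions(image):
    print(row)
```

Working at the region level smooths over single noisy pixels, which matters when the goal is to count tree crowns rather than individual leaves.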

Screen Shot 2014-10-03 at 11.29.07 AM

The image below shows the “results of the initial test run where areas containing coconut trees [above] have been segmented.”

Screen Shot 2014-10-03 at 11.29.23 AM

“Similar techniques could also be used for crop damage assessment after a disaster such as Typhoon Haiyan, where for example standing coconut trees could be distinguished from fallen ones in order to determine capacity to produce coconut-based products.” This is an area that my team and I at QCRI are exploring in partnership with Matthew, Charles and company. In particular, we’re interested in assessing whether crowdsourcing can be used to facilitate the development of machine learning classifiers for image feature detection. More on this here and here, and on CNN here. In addition, since “aerial imagery augmented with ground observations would provide a richer source of information than either one could provide alone,” we are also exploring the integration of social media data with aerial imagery (as described here).

In conclusion, Matthew, Charles and team are looking to further develop the above framework by automating more processes, “such as image filtering and image contrast enhancement. Autonomous take-off & landing will be configured for the custom UAV in order to reduce the need for a skilled pilot. A catapult system will be created for the UAV to launch in areas with a small clearing and a parachute system will be added in order to reduce the risk of damage due to belly landings.” I very much look forward to following the team’s progress and to collaborating with them on imagery analysis for disaster response.

See Also:

  • Official UN Policy Brief on Humanitarian UAVs [link]
  • Common Misconceptions About Humanitarian UAVs [link]
  • Humanitarians in the Sky: Using UAVs for Disaster Response [link]
  • Humanitarian UAVs Fly in China After Earthquake [link]
  • Humanitarian UAV Missions During Balkan Floods [link]
  • Humanitarian UAVs in the Solomon Islands [link]
  • UAVs, Community Mapping & Disaster Risk Reduction in Haiti [link]

Analyzing Tweets on Malaysia Flight #MH370

My QCRI colleague Dr. Imran is using our AIDR platform (Artificial Intelligence for Disaster Response) to collect & analyze tweets related to Malaysia Flight 370, which went missing several days ago. He has collected well over 850,000 English-language tweets since March 11th, using the following keywords/hashtags: Malaysia Airlines flight, #MH370, #PrayForMH370 and #MalaysiaAirlines.

MH370 Prayers

Imran then used AIDR to create a number of “machine learning classifiers” to automatically classify all incoming tweets into categories that he is interested in:

  • Informative: tweets that relay breaking news, useful info, etc

  • Praying: tweets that are related to prayers and faith

  • Personal: tweets that express personal opinions

The process is super simple. All he does is tag several dozen incoming tweets into their respective categories. This teaches AIDR what an “Informative” tweet should “look like”. Since our novel approach combines human intelligence with artificial intelligence, AIDR is typically far more accurate at capturing relevant tweets than Twitter’s keyword search.

And the more tweets that Imran tags, the more accurate AIDR gets. At present, AIDR can auto-classify ~500 tweets per second, or 30,000 tweets per minute. This is well above the highest velocity of crisis tweets recorded thus far—16,000 tweets/minute during Hurricane Sandy.
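
To give a feel for how a handful of tagged examples can teach a classifier, here is a toy sketch using a small naive Bayes model written from scratch. I am not claiming this is the algorithm AIDR uses; the class name, the training tweets and the way the confidence is computed are all invented for illustration.

```python
import math
from collections import Counter, defaultdict


class TinyNaiveBayes:
    """Multinomial naive Bayes over whitespace-separated words."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.doc_counts = Counter()              # label -> number of examples
        self.vocab = set()

    def train(self, text, label):
        words = text.lower().split()
        self.word_counts[label].update(words)
        self.doc_counts[label] += 1
        self.vocab.update(words)

    def classify(self, text):
        """Return (best_label, confidence); requires at least one training example."""
        words = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)  # add-one smoothing
            score = math.log(n_docs / total_docs)
            for word in words:
                score += math.log((counts[word] + 1) / denom)
            log_scores[label] = score
        # Turn log scores into probabilities that sum to one.
        top = max(log_scores.values())
        weights = {k: math.exp(v - top) for k, v in log_scores.items()}
        best = max(weights, key=weights.get)
        return best, weights[best] / sum(weights.values())


# Invented examples standing in for manually tagged tweets.
model = TinyNaiveBayes()
model.train("search area expanded officials confirm", "Informative")
model.train("officials confirm new radar data", "Informative")
model.train("praying for the passengers and crew", "Praying")
model.train("prayers for the families tonight", "Praying")
model.train("i think the airline handled this badly", "Personal")
model.train("i feel this story gets stranger", "Personal")

label, confidence = model.classify("praying for the families")
print(label, round(confidence, 2))
```

Every additional tagged tweet updates the word counts, which is the simplest version of why more tagging tends to mean better accuracy.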

The graph below depicts the number of tweets generated per day since we started the AIDR collection on March 11th.

Volume of Tweets per Day

This series of pie charts simply reflects the relative share of tweets per category over the past four days.

Tweets Trends

Below are some of the tweets that AIDR has automatically classified as being Informative (click to enlarge). The “Confidence” score simply reflects how confident AIDR is that it has correctly auto-classified a tweet. Note that Imran could also have crowdsourced the manual tagging—that is, he could have crowdsourced the process of teaching AIDR. To learn more about how AIDR works, please see this short overview and this research paper (PDF).

AIDR output

If you’re interested in testing AIDR (still very much under development) and/or would like the Tweet IDs for the 850,000+ tweets we’ve collected using AIDR, then feel free to contact me. In the meantime, we’ll start a classifier that auto-collects tweets related to hijacking, criminal causes, and so on. If you’d like us to create a classifier for a different topic, let us know, but we can’t make any promises since we’re working on an important project deadline. When we’re further along with the development of AIDR, anyone will be able to easily collect & download tweets and create & share their own classifiers for events related to humanitarian issues.

Acknowledgements: Many thanks to Imran for collecting and classifying the tweets. Imran also shared the graphs and tabular output that appear above.

Inferring International and Internal Migration Patterns from Twitter

My QCRI colleagues Kiran Garimella and Ingmar Weber recently co-authored an important study on migration patterns discerned from Twitter. The study was co-authored with Bogdan State (Stanford) and lead author Emilio Zagheni (CUNY). The authors analyzed 500,000 Twitter users based in OECD countries between May 2011 and April 2013. Since Twitter users are not representative of the OECD population, the study uses a “difference-in-differences” approach to reduce selection bias when inferring trends in out-migration rates for individual countries. The paper is available here and key insights & results are summarized below.

Twitter Migration

To better understand the demographic characteristics of the Twitter users under study, the authors used face recognition software (Face++) to estimate both the gender and age of users based on their profile pictures. “Face++ uses computer vision and data mining techniques applied to a large database of celebrities to generate estimates of age and sex of individuals from their pictures.” The results are depicted below (click to enlarge). Naturally, there is an important degree of uncertainty about estimates for single individuals. “However, when the data is aggregated, as we did in the population pyramid, the uncertainty is substantially reduced, as overestimates and underestimates of age should cancel each other out.” One important limitation is that age estimates may still be biased if users upload younger pictures of themselves, which would result in underestimating the age of the sample population. This is why other methods to infer age (and gender) should also be applied.

Twitter Migration 3
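
The point about aggregation can be checked with a small simulation. The sketch below adds random error to a set of made-up ages and compares the typical error for a single individual with the error in the average age of the whole group. It then repeats the exercise with a systematic bias, mimicking users who upload younger pictures, to show that this kind of error does not cancel out. The noise level, bias and sample size are my own assumptions, not figures from the paper or from Face++.

```python
import random
import statistics


def simulate(n_users, noise_sd, bias, seed=0):
    """Return (mean absolute error per individual, error of the group average)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    true_ages = [rng.uniform(18, 60) for _ in range(n_users)]
    estimates = [age + bias + rng.gauss(0.0, noise_sd) for age in true_ages]
    individual = statistics.mean(
        [abs(est - age) for est, age in zip(estimates, true_ages)]
    )
    aggregate = abs(statistics.mean(estimates) - statistics.mean(true_ages))
    return individual, aggregate


N_USERS = 10_000   # assumption
NOISE_SD = 8.0     # assumption: error of a single estimate, in years
BIAS = -3.0        # assumption: profile pictures look three years younger

unbiased = simulate(N_USERS, NOISE_SD, bias=0.0)
biased = simulate(N_USERS, NOISE_SD, bias=BIAS)

print(f"random error only: individual {unbiased[0]:.1f} yrs, average {unbiased[1]:.2f} yrs")
print(f"with younger pics: individual {biased[0]:.1f} yrs, average {biased[1]:.2f} yrs")
```

Random errors shrink as the group grows, but a shared bias passes straight through to the average, which is why the authors' caveat about younger profile pictures matters.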

I’m particularly interested in the bias-correction “difference-in-differences” method used in this study, which demonstrates that one can still extract meaningful information about trends even though statistical inferences cannot be drawn, since the underlying data does not constitute a representative sample. Applying this method yields the following results (click to enlarge):

Twitter Migration 2

The above graph reveals a number of interesting insights. For example, one can observe a decline in out-migration rates from Mexico to other countries, which is consistent with recent estimates from Pew Research Center. Meanwhile, in Southern Europe, the results show that out-migration flows continue to increase for countries that were/are hit hard by the economic crisis, like Greece.
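
For those unfamiliar with the “difference-in-differences” idea, here is a generic sketch with invented numbers. The change in one country's out-migration rate between two periods is compared with the change in a reference group over the same periods. If Twitter's selection bias shifts both in a similar way, subtracting one change from the other removes the shared part. The exact estimator in the paper may differ from this; the function names and all counts are hypothetical.

```python
def out_migration_rate(movers: int, users: int) -> float:
    """Share of a country's observed users who moved abroad in a period."""
    return movers / users


def diff_in_diff(country_before: float, country_after: float,
                 reference_before: float, reference_after: float) -> float:
    """Change in the country's rate minus the change in the reference rate."""
    return (country_after - country_before) - (reference_after - reference_before)


# Invented counts: (users who moved, users observed) for two periods.
country_p1 = out_migration_rate(200, 10_000)      # 0.020
country_p2 = out_migration_rate(260, 10_000)      # 0.026
reference_p1 = out_migration_rate(1_500, 100_000)  # 0.015
reference_p2 = out_migration_rate(1_700, 100_000)  # 0.017

effect = diff_in_diff(country_p1, country_p2, reference_p1, reference_p2)
print(f"raw change in country rate: {country_p2 - country_p1:+.3f}")
print(f"change relative to reference group: {effect:+.3f}")
```

The result speaks to the direction and relative size of a trend, not to the true out-migration rate, which is exactly the limitation noted above.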

The results of this study suggest that such methods can be used to “predict turning points in migration trends, which are particularly relevant for migration forecasting.” In addition, the results indicate that “geolocated Twitter data can substantially improve our understanding of the relationships between internal and international migration.” Furthermore, since the study relies on publicly available, real-time data, this approach could also be used to monitor migration trends on an ongoing basis.

To what extent the above is feasible remains to be seen. Very recent mobility data from official statistics are simply not available to more closely calibrate and validate the study’s results. In any event, this study is an important step towards addressing a central question that humanitarian organizations are also asking: how can we make statistical inferences from online data when ground-truth data is unavailable as a reference?

I asked Emilio whether techniques like “difference-in-differences” could be used to monitor forced migration. As he noted, there is typically little to no ground truth data available in humanitarian crises. He thus believes that their approach is potentially relevant to evaluate forced migration. That said, he is quick to caution against making generalizations. Their study focused on OECD countries, which represent relatively large samples and high Internet diffusion, which means low selection bias. In contrast, data samples for humanitarian crises tend to be far smaller and highly selected. This means that filtering out the bias may prove more difficult. I hope that this is a challenge that Emilio and his co-authors choose to take on in the near future.

#Westgate Tweets: A Detailed Study in Information Forensics

My team and I at QCRI have just completed a detailed analysis of the 13,200+ tweets posted from one hour before the attacks began until two hours into the attack. The purpose of this study, which will be launched at CrisisMappers 2013 in Nairobi tomorrow, is to make sense of the Big (Crisis) Data generated during the first hours of the siege. A summary of our results is displayed below. The full results of our analysis and discussion of findings are available as a GoogleDoc and also as a PDF. The purpose of this public GoogleDoc is to solicit comments on our methodology so as to inform the next phase of our research. Indeed, our aim is to categorize and study the entire Westgate dataset in the coming months (730,000+ tweets). In the meantime, sincere appreciation goes to my outstanding QCRI Research Assistants, Ms. Brittany Card and Ms. Justine MacKinnon, for their hard work on the coding and analysis of the 13,200+ tweets. Our study builds on this preliminary review.

The following 7 figures summarize the main findings of our study. These are discussed in more detail in the GoogleDoc/PDF.

Figure 1: Who Authored the Most Tweets?

Figure 2: Frequency of Tweets by Eyewitnesses Over Time?

Figure 3: Who Were the Tweets Directed At?

Figure 4: What Content Did Tweets Contain?

Figure 5: What Terms Were Used to Reference the Attackers?

Figure 6: What Terms Were Used to Reference Attackers Over Time?

Figure 7: What Kind of Multimedia Content Was Shared?

Hashtag Analysis of #Westgate Crisis Tweets

In July 2013, my team and I at QCRI launched this dashboard to analyze hashtags used by Twitter users during crises. Our first case study, which is available here, focused on Hurricane Sandy. Since then, both the UN and Greenpeace have also made use of the dashboard to analyze crisis tweets.

QCRI_Dashboard

We just uploaded 700,000+ Westgate-related tweets to the dashboard. The results are available here and also displayed above. The dashboard is still under development, so we very much welcome feedback on how to improve it for future analysis. You can upload your own tweets to the dashboard if you’d like to test drive the platform.

See also: Forensics Analysis of #Westgate Tweets (Link)

Analyzing Crisis Hashtags on Twitter (Updated)

Update: You can now upload your own tweets to the Crisis Hashtags Analysis Dashboard here

Hashtag footprints can be revealing. The map below, for example, displays the top 200 locations in the world with the most Twitter hashtags. The top 5 are São Paulo, London, Jakarta, Los Angeles and New York.

Hashtag map

A recent study (PDF) of 2 billion geo-tagged tweets and 27 million unique hashtags found that “hashtags are essentially a local phenomenon with long-tailed life spans.” The analysis also revealed that hashtags triggered by external events like disasters “spread faster than hashtags that originate purely within the Twitter network itself.” Like other metadata, hashtags can be informative in and of themselves. For example, they can provide early warning signals of social tensions in Egypt, as demonstrated in this study. So might they also reveal interesting patterns during and after major disasters?

Tens of thousands of distinct crisis hashtags were posted to Twitter during Hurricane Sandy. While #Sandy and #hurricane featured most, thousands more were also used. For example: #SandyHelp, #rallyrelief, #NJgas, #NJopen, #NJpower, #staysafe, #sandypets, #restoretheshore, #noschool, #fail, etc. #NJpower, for example, “helped keep track of the power situation throughout the state. Users and news outlets used this hashtag to inform residents where power outages were reported and gave areas updates as to when they could expect their power to come back” (1).

Sandy Hashtags

My colleagues and I at QCRI are studying crisis hashtags to better understand the variety of tags used during and in the immediate aftermath of major crises. Popular hashtags used during disasters often overshadow more hyperlocal ones, making these less discoverable. Other challenges include the “proliferation of hashtags that do not cross-pollinate and a lack of usability in the tools necessary for managing massive amounts of streaming information for participants who needed it” (2). To address these challenges and analyze crisis hashtags, we’ve just launched a Crisis Hashtags Analytics Dashboard. As displayed below, our first case study is Hurricane Sandy. We’ve uploaded about half-a-million tweets posted between October 27th and November 7th, 2012 to the dashboard.

QCRI_Dashboard

Users can visualize the frequency of tweets (orange line) and hashtags (green line) over time using different time-steps, ranging from 10-minute to 1-day intervals. They can also “zoom in” to capture more minute changes in the number of hashtags per time interval. (The dramatic drop on October 30th is due to a server crash. So if you have access to tweets posted during those hours, I’d be grateful if you could share them with us).

Hashtag timeline
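
The time-step idea behind this timeline can be sketched in a few lines. The example below floors each tweet's timestamp to the start of its interval and counts tweets and hashtag occurrences per interval. Whether the dashboard counts hashtag occurrences or distinct hashtags is not something I specify above, so treat that choice, the function names and the sample tweets as assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def bin_start(ts: datetime, step: timedelta) -> datetime:
    """Floor a timestamp to the start of the time-step that contains it."""
    return EPOCH + ((ts - EPOCH) // step) * step


def counts_per_bin(tweets, step):
    """Count tweets and hashtag occurrences in each time-step."""
    tweet_counts, hashtag_counts = Counter(), Counter()
    for ts, text in tweets:
        start = bin_start(ts, step)
        tweet_counts[start] += 1
        hashtag_counts[start] += sum(
            1 for word in text.split() if word.startswith("#") and len(word) > 1
        )
    return tweet_counts, hashtag_counts


# Invented sample tweets.
tweets = [
    (datetime(2012, 10, 29, 14, 31), "Power out in Hoboken #NJpower #Sandy"),
    (datetime(2012, 10, 29, 14, 37), "Stay safe everyone #staysafe"),
    (datetime(2012, 10, 29, 14, 44), "Gas station open on Route 9 #NJgas"),
]

for step in (timedelta(minutes=10), timedelta(days=1)):
    tweet_counts, hashtag_counts = counts_per_bin(tweets, step)
    print("step:", step)
    for start in sorted(tweet_counts):
        print(" ", start, tweet_counts[start], "tweets,", hashtag_counts[start], "hashtags")
```

Shorter time-steps reveal bursts that a daily total would hide, at the cost of a noisier line.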

In the second part of the dashboard (displayed below), users can select any point on the graph to display the top “K” most frequent hashtags. The default value for K is 10 (e.g., top-10 most frequent hashtags) but users can change this by typing in a different number. In addition, the 10 least-frequent hashtags are displayed, as are the 10 “middle-most” hashtags. The top-10 newest hashtags posted during the selected time are also displayed as are the hashtags that have seen the largest increase in frequency. These latter two metrics, “New K” and “Top Increasing K”, may provide early warning signals during disasters. Indeed, the appearance of a new hashtag can reveal a new problem or need while a rapid increase in the frequency of some hashtags can denote the spread of a problem or need.

QCRI Dashboard 2
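
The metrics in this part of the dashboard can be approximated with simple frequency counts. The sketch below gives one plausible reading of each metric for a selected time interval and the interval before it. The precise definitions used in the dashboard are not spelled out in this post, so the function names, the way “middle-most” is computed and the sample counts are all assumptions.

```python
from collections import Counter


def top_k(counts: Counter, k: int = 10):
    """The k most frequent hashtags."""
    return [tag for tag, _ in counts.most_common(k)]


def least_k(counts: Counter, k: int = 10):
    """The k least frequent hashtags, rarest first."""
    return [tag for tag, _ in sorted(counts.items(), key=lambda kv: kv[1])[:k]]


def middle_k(counts: Counter, k: int = 10):
    """The k hashtags around the middle of the frequency ranking."""
    ranked = top_k(counts, len(counts))
    start = max((len(ranked) - k) // 2, 0)
    return ranked[start:start + k]


def new_k(current: Counter, seen_before, k: int = 10):
    """Most frequent hashtags in the current interval that were never seen earlier."""
    fresh = Counter({tag: n for tag, n in current.items() if tag not in seen_before})
    return top_k(fresh, k)


def top_increasing_k(current: Counter, previous: Counter, k: int = 10):
    """Hashtags with the largest rise in frequency since the previous interval."""
    rise = {tag: n - previous[tag] for tag, n in current.items()}
    return sorted(rise, key=rise.get, reverse=True)[:k]


# Invented counts for two consecutive intervals.
previous = Counter({"#sandy": 45, "#hurricane": 31, "#njpower": 3})
current = Counter({"#sandy": 50, "#hurricane": 30, "#njpower": 12,
                   "#njgas": 7, "#sandypets": 2})

print("top:", top_k(current, 2))
print("least:", least_k(current, 2))
print("middle:", middle_k(current, 1))
print("new:", new_k(current, previous, 2))
print("increasing:", top_increasing_k(current, previous, 2))
```

In this made-up example the top hashtags barely move while #njpower and #njgas surface in the “new” and “increasing” lists, which is the kind of signal that could point to an emerging need.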

The third part of the dashboard allows users to visualize and compare the frequency of top hashtags over time. This feature is displayed in the screenshot below. Patterns that arise from diverging or converging hashtags may indicate important developments on the ground.

QCRI Dashboard 3

We’re only at the early stages of developing our hashtags analytics platform (above), but we hope the tool will provide insights during future disasters. For now, we’re simply experimenting and tinkering. So feel free to get in touch if you would like to collaborate and/or suggest some research questions.

Acknowledgements: Many thanks to QCRI colleagues Ahmed Meheina and Sofiane Abbar for their work on developing the dashboard.