Graduation Year

2023

Document Type

Dissertation

Degree

Ph.D.

Degree Name

Doctor of Philosophy (Ph.D.)

Degree Granting Department

Computer Science and Engineering

Major Professor

Sriram Chellappan, Ph.D.

Committee Member

Srinivas Katkoori, Ph.D.

Committee Member

Mehran Mozaffari Kermani, Ph.D.

Committee Member

James Stock, Ph.D.

Committee Member

Stephen Saddow, Ph.D.

Keywords

311, Garbage detection, Needle detection, Object detection, Transfer learning

Abstract

Smart governance is becoming increasingly important, not only in advanced countries but across the globe. Thanks to global-scale network connectivity, the pervasiveness of smart devices of various form factors, and an overall improvement in digital literacy, we now see "smartness" everywhere, and where we do not, the general public expects it. Ultimately, the goal of smart governance is to use state-of-the-art technologies to improve citizens’ lives. With the ubiquity of smartphone technologies today, citizens more readily collaborate with public officials to improve quality of life in their communities. By using appropriate tools, public officials can maximize the benefits of citizen participation and leverage citizens’ insights to make informed decisions. With more responsive systems, public officials can increase citizen satisfaction and engagement, encouraging greater participation and leading to more effective and collaborative governance. Adopting new technologies can also demonstrate a commitment to innovation and continual improvement, showing citizens that their local government is proactive and forward-thinking.

The 311 service is popular in the US and is available in many cities for reporting non-emergency civic issues in the community. 311-based systems are widely used to report problems such as overflowing garbage, illegal dumping, evidence of drug activity on streets, medical waste and graffiti. Previously, the 311 service was available primarily by phone. As technology has advanced, it has expanded to include channels such as a mobile app, a website, and Twitter. With the popularity of 311 mobile apps, more and more citizens take photos and report issues with just a few taps. The overall goal of this dissertation is the design, deployment and validation of computer vision technologies to solve problems of critical importance in smart governance. Specifically, our focus is to automate the detection and classification of garbage on streets and the detection of drug use. Both problems are of not only national but also global importance.

For this dissertation, we collected photos from San Francisco 311 that were taken and uploaded by citizens of San Francisco. Using this image data, we present computer vision techniques to detect and localize objects of interest automatically. Because the images are citizen-generated, they vary in resolution and angle and contain objects of different sizes, against backgrounds from various locations in San Francisco. Using such a diverse range of images contributes to building a more stable and robust object detection model. We compared and evaluated different convolutional neural network object detection models for identifying objects related to 311 service categories.

We first examined and compared 311 datasets from three major US cities known for their successful 311 services: Los Angeles, San Francisco and Boston. We found that garbage was one of the major issues across all three cities. We also discovered that San Francisco 311 publicly shares its image data. We then conducted a statistical analysis of the San Francisco 311 dataset and plotted 311 requests on a map of San Francisco. We observed that "Street and Sidewalk Cleaning" was the most frequently requested 311 service, accounting for 36% of all 311 requests. Among the service detail categories within "Street and Sidewalk Cleaning", "Other Loose Garbage" was the largest at 12%, followed by "Furniture" at 5%. Then, we performed garbage object detection on the San Francisco 311 image data using two pre-trained models, namely Faster R-CNN and RetinaNet. The models detected the following four classes: garbage, cardboard box, garbage can and garbage can overflow. RetinaNet outperformed Faster R-CNN, achieving a mean average precision (mAP) of 0.87.
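
As a rough illustration of what a detection score such as mAP is built on, the sketch below computes the intersection over union (IoU) between a predicted box and a ground-truth box and uses it to decide whether a prediction counts as a true positive. The function names, the (x1, y1, x2, y2) box format, and the 0.5 IoU threshold are assumptions made for this example; the abstract does not state the evaluation protocol that was actually used.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (if any).
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, pred_label, gt_box, gt_label, threshold=0.5):
    """A prediction matches when the class agrees and IoU meets the threshold.

    The 0.5 threshold is an assumed, commonly used value.
    """
    return pred_label == gt_label and iou(pred_box, gt_box) >= threshold


# Hypothetical boxes for the "garbage" class.
match = is_true_positive((0, 0, 10, 10), "garbage", (0, 0, 10, 8), "garbage")
```

Average precision for one class summarizes precision over recall levels using matches like these, and mAP is the mean of those per-class values.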

Next, we tackled another important worldwide social problem: detecting illegal drug use. For centuries, illegal drug use has been a major problem in societies around the world, despite the considerable effort and resources invested in combating it. In the United States, drug overdose is the leading cause of injury-related deaths, which led the CDC to declare drug overdose an epidemic in America. One of the critical challenges in the war against drugs is that people who inject drugs discard syringes and drug-related objects in public, which leads to civic problems stemming not only from increased drug use but also from threats to public safety and health. Additionally, sharing contaminated injecting instruments such as syringes is the primary cause of blood-borne infectious diseases such as hepatitis C, hepatitis B and HIV. We explored the possibility of using crowd support to combat illegal drug use by performing drug-related object detection using three pre-trained models: Faster R-CNN, EfficientDet and YOLO v5. The models detected the following nine classes: "biohazard box", "cooker", "orange cap", "plunger", "syringe", "syringe packaging", "tourniquet", "vial" and "white cap". YOLO v5 performed the best, with mAP = 0.64 on the test dataset.

In addition, we investigated hyperparameters for these convolutional neural network object detection models, namely Faster R-CNN, RetinaNet, EfficientDet and YOLO v5. Hyperparameters are settings that control how an object detection model is trained. Setting them appropriately for each task is therefore imperative for developing efficient and high-performing inference models. We found that some hyperparameters are common across models, such as the learning rate, the batch size and the number of epochs. We also found that each object detection model accepts its own set of configurations. For example, the Faster R-CNN model allows the user to choose among the RMSProp, Momentum, and Adam optimizers. The RetinaNet model can be configured with an anchor config file, which lists the anchor sizes, strides, ratios and scales, one setting per line. YOLO v5 uses a YAML file, a text-based format often used for configuration, to set hyperparameters.
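
To make the role of the anchor ratios and scales concrete, the following is a minimal sketch of how one base anchor size can be expanded into a set of anchor shapes. The function name `anchor_shapes`, the convention that a ratio means height divided by width, and the numeric values are assumptions chosen for illustration; they are not the settings used in this dissertation.

```python
import math


def anchor_shapes(size, ratios, scales):
    """Return (width, height) pairs for one base anchor size.

    Assumes ratio = height / width and that each scale multiplies the
    base size, so the anchor area is (size * scale) ** 2 for every ratio.
    """
    shapes = []
    for ratio in ratios:
        for scale in scales:
            area = (size * scale) ** 2
            width = math.sqrt(area / ratio)
            height = width * ratio
            shapes.append((width, height))
    return shapes


# Illustrative values only: three aspect ratios and three scales.
ratios = [0.5, 1.0, 2.0]
scales = [1.0, 2 ** (1 / 3), 2 ** (2 / 3)]
shapes = anchor_shapes(32, ratios, scales)
```

With three ratios and three scales, each base size yields nine anchor shapes. Small or elongated objects may call for different ratios and scales, which is one reason these settings are treated as tunable.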

Our evaluation results show that our system can be an effective tool for next-generation smart governance systems for identifying garbage and drug-related objects in the public domain. Moreover, our work may have global impact, as many parts of the world suffer from similar social issues. We are currently developing a web-based demonstration of our work and, in the future, hope to engage civic officials in deploying our technologies for real use.
