new

Get trending papers in your email inbox!

Subscribe

Daily Papers

byAK and the research community

Apr 17

Recoding latent sentence representations -- Dynamic gradient-based activation modification in RNNs

In Recurrent Neural Networks (RNNs), encoding information in a suboptimal or erroneous way can impact the quality of representations based on later elements in the sequence and subsequently lead to wrong predictions and a worse model performance. In humans, challenging cases like garden path sentences (an instance of this being the infamous "The horse raced past the barn fell") can lead their language understanding astray. However, they are still able to correct their representation accordingly and recover when new information is encountered. Inspired by this, I propose an augmentation to standard RNNs in form of a gradient-based correction mechanism: This way I hope to enable such models to dynamically adapt their inner representation of a sentence, adding a way to correct deviations as soon as they occur. This could therefore lead to more robust models using more flexible representations, even during inference time. I conduct different experiments in the context of language modeling, where the impact of using such a mechanism is examined in detail. To this end, I look at modifications based on different kinds of time-dependent error signals and how they influence the model performance. Furthermore, this work contains a study of the model's confidence in its predictions during training and for challenging test samples and the effect of the manipulation thereof. Lastly, I also study the difference in behavior of these novel models compared to a standard LSTM baseline and investigate error cases in detail to identify points of future research. I show that while the proposed approach comes with promising theoretical guarantees and an appealing intuition, it is only able to produce minor improvements over the baseline due to challenges in its practical application and the efficacy of the tested model variants.

  • 1 authors
·
Jan 3, 2021

Empowering Agricultural Insights: RiceLeafBD - A Novel Dataset and Optimal Model Selection for Rice Leaf Disease Diagnosis through Transfer Learning Technique

The number of people living in this agricultural nation of ours, which is surrounded by lush greenery, is growing on a daily basis. As a result of this, the level of arable land is decreasing, as well as residential houses and industrial factories. The food crisis is becoming the main threat for us in the upcoming days. Because on the one hand, the population is increasing, and on the other hand, the amount of food crop production is decreasing due to the attack of diseases. Rice is one of the most significant cultivated crops since it provides food for more than half of the world's population. Bangladesh is dependent on rice (Oryza sativa) as a vital crop for its agriculture, but it faces a significant problem as a result of the ongoing decline in rice yield brought on by common diseases. Early disease detection is the main difficulty in rice crop cultivation. In this paper, we proposed our own dataset, which was collected from the Bangladesh field, and also applied deep learning and transfer learning models for the evaluation of the datasets. We elaborately explain our dataset and also give direction for further research work to serve society using this dataset. We applied a light CNN model and pre-trained InceptionNet-V2, EfficientNet-V2, and MobileNet-V2 models, which achieved 91.5% performance for the EfficientNet-V2 model of this work. The results obtained assaulted other models and even exceeded approaches that are considered to be part of the state of the art. It has been demonstrated by this study that it is possible to precisely and effectively identify diseases that affect rice leaves using this unbiased datasets. After analysis of the performance of different models, the proposed datasets are significant for the society for research work to provide solutions for decreasing rice leaf disease.

  • 6 authors
·
Jan 15, 2025

Arboretum: A Large Multimodal Dataset Enabling AI for Biodiversity

We introduce Arboretum, the largest publicly accessible dataset designed to advance AI for biodiversity applications. This dataset, curated from the iNaturalist community science platform and vetted by domain experts to ensure accuracy, includes 134.6 million images, surpassing existing datasets in scale by an order of magnitude. The dataset encompasses image-language paired data for a diverse set of species from birds (Aves), spiders/ticks/mites (Arachnida), insects (Insecta), plants (Plantae), fungus/mushrooms (Fungi), snails (Mollusca), and snakes/lizards (Reptilia), making it a valuable resource for multimodal vision-language AI models for biodiversity assessment and agriculture research. Each image is annotated with scientific names, taxonomic details, and common names, enhancing the robustness of AI model training. We showcase the value of Arboretum by releasing a suite of CLIP models trained using a subset of 40 million captioned images. We introduce several new benchmarks for rigorous assessment, report accuracy for zero-shot learning, and evaluations across life stages, rare species, confounding species, and various levels of the taxonomic hierarchy. We anticipate that Arboretum will spur the development of AI models that can enable a variety of digital tools ranging from pest control strategies, crop monitoring, and worldwide biodiversity assessment and environmental conservation. These advancements are critical for ensuring food security, preserving ecosystems, and mitigating the impacts of climate change. Arboretum is publicly available, easily accessible, and ready for immediate use. Please see the https://baskargroup.github.io/Arboretum/{project website} for links to our data, models, and code.

  • 15 authors
·
Jun 25, 2024 1

It is not always greener on the other side: Greenery perception across demographics and personalities in multiple cities

Quantifying and assessing urban greenery is consequential for planning and development, reflecting the everlasting importance of green spaces for multiple climate and well-being dimensions of cities. Evaluation can be broadly grouped into objective (e.g., measuring the amount of greenery) and subjective (e.g., polling the perception of people) approaches, which may differ -- what people see and feel about how green a place is might not match the measurements of the actual amount of vegetation. In this work, we advance the state of the art by measuring such differences and explaining them through human, geographic, and spatial dimensions. The experiments rely on contextual information extracted from street view imagery and a comprehensive urban visual perception survey collected from 1,000 people across five countries with their extensive demographic and personality information. We analyze the discrepancies between objective measures (e.g., Green View Index (GVI)) and subjective scores (e.g., pairwise ratings), examining whether they can be explained by a variety of human and visual factors such as age group and spatial variation of greenery in the scene. The findings reveal that such discrepancies are comparable around the world and that demographics and personality do not play a significant role in perception. Further, while perceived and measured greenery correlate consistently across geographies (both where people and where imagery are from), where people live plays a significant role in explaining perceptual differences, with these two, as the top among seven, features that influences perceived greenery the most. This location influence suggests that cultural, environmental, and experiential factors substantially shape how individuals observe greenery in cities.

  • 12 authors
·
Dec 18, 2025

The 'Paris-end' of town? Urban typology through machine learning

The confluence of recent advances in availability of geospatial information, computing power, and artificial intelligence offers new opportunities to understand how and where our cities differ or are alike. Departing from a traditional `top-down' analysis of urban design features, this project analyses millions of images of urban form (consisting of street view, satellite imagery, and street maps) to find shared characteristics. A (novel) neural network-based framework is trained with imagery from the largest 1692 cities in the world and the resulting models are used to compare within-city locations from Melbourne and Sydney to determine the closest connections between these areas and their international comparators. This work demonstrates a new, consistent, and objective method to begin to understand the relationship between cities and their health, transport, and environmental consequences of their design. The results show specific advantages and disadvantages using each type of imagery. Neural networks trained with map imagery will be highly influenced by the mix of roads, public transport, and green and blue space as well as the structure of these elements. The colours of natural and built features stand out as dominant characteristics in satellite imagery. The use of street view imagery will emphasise the features of a human scaled visual geography of streetscapes. Finally, and perhaps most importantly, this research also answers the age-old question, ``Is there really a `Paris-end' to your city?''.

  • 5 authors
·
Oct 8, 2019

OpenAI GPT-5 System Card

This is the system card published alongside the OpenAI GPT-5 launch, August 2025. GPT-5 is a unified system with a smart and fast model that answers most questions, a deeper reasoning model for harder problems, and a real-time router that quickly decides which model to use based on conversation type, complexity, tool needs, and explicit intent (for example, if you say 'think hard about this' in the prompt). The router is continuously trained on real signals, including when users switch models, preference rates for responses, and measured correctness, improving over time. Once usage limits are reached, a mini version of each model handles remaining queries. This system card focuses primarily on gpt-5-thinking and gpt-5-main, while evaluations for other models are available in the appendix. The GPT-5 system not only outperforms previous models on benchmarks and answers questions more quickly, but -- more importantly -- is more useful for real-world queries. We've made significant advances in reducing hallucinations, improving instruction following, and minimizing sycophancy, and have leveled up GPT-5's performance in three of ChatGPT's most common uses: writing, coding, and health. All of the GPT-5 models additionally feature safe-completions, our latest approach to safety training to prevent disallowed content. Similarly to ChatGPT agent, we have decided to treat gpt-5-thinking as High capability in the Biological and Chemical domain under our Preparedness Framework, activating the associated safeguards. While we do not have definitive evidence that this model could meaningfully help a novice to create severe biological harm -- our defined threshold for High capability -- we have chosen to take a precautionary approach.

  • 484 authors
·
Dec 19, 2025

Governance of the AI, by the AI, and for the AI

Over the past half century, there have been several false dawns during which the "arrival" of world-changing artificial intelligence (AI) has been heralded. Tempting fate, the authors believe the age of AI has, indeed, finally arrived. Powerful image generators, such as DALL-E2 and Midjourney have suddenly allowed anyone with access the ability easily to create rich and complex art. In a similar vein, text generators, such as GPT3.5 (including ChatGPT) and BLOOM, allow users to compose detailed written descriptions of many topics of interest. And, it is even possible now for a person without extensive expertise in writing software to use AI to generate code capable of myriad applications. While AI will continue to evolve and improve, probably at a rapid rate, the current state of AI is already ushering in profound changes to many different sectors of society. Every new technology challenges the ability of humanity to govern it wisely. However, governance is usually viewed as both possible and necessary due to the disruption new technology often poses to social structures, industries, the environment, and other important human concerns. In this article, we offer an analysis of a range of interactions between AI and governance, with the hope that wise decisions may be made that maximize benefits and minimize costs. The article addresses two main aspects of this relationship: the governance of AI by humanity, and the governance of humanity by AI. The approach we have taken is itself informed by AI, as this article was written collaboratively by the authors and ChatGPT.

  • 2 authors
·
May 3, 2023

Local Rose Breeds Detection System Using Transfer Learning Techniques

Flower breed detection and giving details of that breed with the suggestion of cultivation processes and the way of taking care is important for flower cultivation, breed invention, and the flower business. Among all the local flowers in Bangladesh, the rose is one of the most popular and demanded flowers. Roses are the most desirable flower not only in Bangladesh but also throughout the world. Roses can be used for many other purposes apart from decoration. As roses have a great demand in the flower business so rose breed detection will be very essential. However, there is no remarkable work for breed detection of a particular flower unlike the classification of different flowers. In this research, we have proposed a model to detect rose breeds from images using transfer learning techniques. For such work in flowers, resources are not enough in image processing and classification, so we needed a large dataset of the massive number of images to train our model. we have used 1939 raw images of five different breeds and we have generated 9306 images for the training dataset and 388 images for the testing dataset to validate the model using augmentation. We have applied four transfer learning models in this research, which are Inception V3, ResNet50, Xception, and VGG16. Among these four models, VGG16 achieved the highest accuracy of 99%, which is an excellent outcome. Breed detection of a rose by using transfer learning methods is the first work on breed detection of a particular flower that is publicly available according to the study.

  • 3 authors
·
Apr 7, 2023

Are Work Zones and Connected Automated Vehicles Ready for a Harmonious Coexistence? A Scoping Review and Research Agenda

The recent advent of connected and automated vehicles (CAVs) is expected to transform the transportation system. CAV technologies are being developed rapidly and they are foreseen to penetrate the market at a rapid pace. On the other hand, work zones (WZs) have become common areas on highway systems as a result of the increasing construction and maintenance activities. The near future will therefore bring the coexistence of CAVs and WZs which makes their interaction inevitable. WZs expose all vehicles to a sudden and complex geometric change in the roadway environment, something that may challenge many of CAV navigation capabilities. WZs however also impose a space contraction resulting in adverse traffic impacts, something that legitimately calls for benefiting from the highly efficient CAV functions. CAVs should be able to reliably traverse WZ geometry and WZs should benefit from CAV intelligent functions. This paper reviews the state-of-the-art and the key concepts, opportunities, and challenges of deploying CAV systems at WZs. The reviewed subjects include traffic performance and behaviour, technologies and infrastructure, and regulatory considerations. Eighteen CAV mobility, safety, and environmental concepts and functions were distributed over the WZ area which was subdivided into five segments: further upstream, approach area, queuing area, WZ activity, and termination area. In addition, among other topics reviewed and discussed are detection of WZ features, smart traffic control devices, various technologies at connected WZs, cross-border harmonization, liability, insurance, and privacy. The paper also provides a research agenda with a list of research needs supported by experts rating and inputs. The paper aims to provide a bird eye view, but with necessary details that can benefit researchers, practitioners, and transportation agencies.

  • 2 authors
·
Jan 29, 2021