PROJECTS

Here are the active projects in DSNS Lab. More are underway and will be added here as they progress.

DISCO-Proc: Distributed Storage, Near-Optimal Coding and Protocol Design for Data Caching in Cellular Networks [Supported by TUBITAK 1001 – 119E235, PI: Suayb S. Arslan]

In cellular networks, caching popular files on mobile devices enables device-to-device communication, which significantly reduces the communication load on the base station (BS). This can be accomplished by distributing the partitions of a popular file to mobile devices in either uncoded form or coded form using erasure coding. Any piece of the chunked content can be retrieved from neighboring mobile devices or, if that is not possible, directly from the BS at a higher communication cost. In a cellular network where each cell contains a set of nodes that arrive and depart at random times, intelligent data repair methods are needed to keep communication with the BS to a minimum. The involvement of one or more BSs adds another dimension to previous repair paradigms, especially to the cooperative repair process, because it changes the set of constraints and the operating protocol rules. No existing work studies the bandwidth/storage trade-off for this particular case. Accordingly, novel cell architectures will call for different design considerations, including but not limited to novel code constructions, protocol designs, data access latency, realistic queuing models and simulation platforms. In this project, unlike previous research, we first aim to obtain improved theoretical bounds on bandwidth and storage capacity using data flow diagrams for the case in which BSs cooperatively help repair the lost data. In addition, inspired by code structures that use bandwidth and storage space efficiently, we will propose original graph-based code constructions, together with novel repair algorithms for these codes, to achieve near-optimality. We will also propose approaches, such as genetic algorithms and optimal left-over data distributions, that have not previously been applied to the node repair problem in the literature.
In addition, novel energy-efficient protocol designs for the check-in and check-out processes will be proposed to help minimize bandwidth and data storage requirements. These protocols will be strengthened by transition (hand-off) scenarios, which allow nodes to migrate from one cell to another while enabling the BSs to collaborate and use intra-cell resources effectively. This involves several novelties, such as adjusting repair intervals (thresholds), reducing data access overheads, using the contents of incoming nodes, and intelligent handling of left-over data. Finally, various well-known and more realistic queuing models will be used to analytically evaluate the proposed code structures and the performance of the protocol architecture. To verify our analytical results, large-scale cellular network simulations will be performed to numerically derive the overall communication and file maintenance costs and to provide comparisons.
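
As a rough illustration of why coded caching can reduce the load on the BS, the sketch below compares the expected number of chunks a requester must fetch from the BS under two simplified placements that use the same total storage. It is not one of the project's code constructions: it assumes an idealized (n, k) MDS erasure code in which any k of the n coded chunks recover the file, one chunk per device, and devices that are absent independently with probability p_depart. The function names and parameters are hypothetical.

```python
from math import comb


def expected_bs_chunks_coded(n, k, p_depart):
    """Expected chunks fetched from the BS for an (n, k) MDS-coded file.

    Assumptions (illustrative only): one coded chunk per device, each device
    is independently absent with probability p_depart, and the BS supplies
    the k - present missing chunks when fewer than k devices are present.
    """
    p_stay = 1.0 - p_depart
    expected = 0.0
    for present in range(n + 1):
        prob = comb(n, present) * p_stay**present * p_depart ** (n - present)
        expected += prob * max(0, k - present)
    return expected


def expected_bs_chunks_uncoded(k, replicas, p_depart):
    """Expected chunks fetched from the BS when each of the k uncoded chunks
    is replicated on `replicas` devices; a chunk needs the BS only if all of
    its replicas are absent."""
    return k * p_depart**replicas


if __name__ == "__main__":
    # Same storage budget: 12 cached chunks for a file of 4 chunks.
    print(expected_bs_chunks_coded(12, 4, 0.5))
    print(expected_bs_chunks_uncoded(4, 3, 0.5))
```

Under these assumptions, the coded placement needs the BS less often than 3-way replication at the same storage cost, because any surviving chunk is useful. Repair traffic, which is the focus of the project, is not modelled here.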


Development and Testing of Concurrent Algorithms to Solve Mixed Integer Linear Programming Problems in Distributed Memory Systems [Supported by TUBITAK 3501 – 119E235, PI: UTKU KOC]

The clock speed of high-end processors has been more or less stable over the past decade. Computer technology is now mainly focused on increasing the number of processors and the amount of memory. With this in mind, algorithms and methods that run on parallel processors in distributed memory systems are inevitably becoming the trend in all areas. The number of processors in personal computers is rapidly increasing, and the services offered by businesses such as Amazon and Google have made distributed memory systems available for enterprise and even individual use, so that concurrent algorithms can be run on cloud systems. From a practical point of view, what matters is solving a problem, or identifying a good solution, within a reasonable amount of wall-clock time, which de-emphasizes the CPU time used. Mixed Integer Linear Programming (MILP) models are used in many applications such as production planning, energy, distribution, health, telecommunication, and logistics. The size and variety of the data generated and stored every day on a global scale are increasing, and the analysis of big data is becoming more important. With the increasing amount of data, larger MILP models need to be solved. In recent years, progress in linear programming has been limited, both computationally and methodologically, whereas progress in MILP has gone further (Lodi, 2010; Gleixner, 2019). It is necessary to develop MILP solution methods that work concurrently in distributed memory systems using up-to-date technologies. There is no published work on heuristic and cutting plane methods implemented in distributed memory systems. This project is the first attempt at running alternative heuristics and cutting plane algorithms concurrently in parallel. This will give a head start to the principal researcher's career, leading to follow-up projects.
This project aims to develop an MILP solution framework that can solve MILP problems in distributed memory systems, where subroutines run concurrently, start from different points, and asynchronously share information among themselves. This framework needs to be robust and stable against problems that can occur in the distributed memory environment. The proposed concurrent optimization framework will allow simultaneous operation of both heuristic and cutting plane algorithms. The effects of starting from different points, information sharing and concurrent implementation will be identified, and the scalability of the methods will be tested for the first time in the literature. Having researchers from the computer science and industrial engineering departments will increase the research team's ability to work across disciplines. The principal researcher will gain skills in parallel programming and cutting plane algorithms. The researcher will acquire new skills in optimization. The effect of the project is twofold: 1) The research team is planning to share the results via 2-3 papers in SCI or SCI-Exp indexed journals, 2-3 presentations at international conferences and 2-3 presentations at national conferences. 2) A computational improvement in solving MILPs will lead to an increase in solvable problem size and/or a reduction in solution times for relevant problems that academics and practitioners face. In the proposed parallel optimization framework, both heuristics and cutting plane algorithms will start from the linear programming relaxation as well as from alternative random interior and vertex points near an optimal solution. The information generated in the subroutines will be shared asynchronously, so that valuable information detected in any subroutine can be used collectively by all subroutines.
For heuristic and cutting plane methods, the effects of starting from different points, asynchronous information sharing and concurrent running will be determined independently of each other and the scalability of the method will be tested.
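
The idea of concurrent subroutines that start from different points and share information asynchronously can be sketched on a toy problem. The example below is only an illustration, not the project's framework: it uses Python threads in shared memory as a stand-in for processes in a distributed memory system, a simple local search as a stand-in for the heuristics and cutting plane algorithms, and a hypothetical 0/1 knapsack instance as a tiny MILP. All names and data are assumptions made for this sketch.

```python
import random
import threading
from itertools import product

# Hypothetical toy 0/1 knapsack instance (a tiny MILP), for illustration only.
VALUES = [10, 13, 7, 8, 12, 4, 9]
WEIGHTS = [5, 7, 3, 4, 6, 2, 5]
CAPACITY = 15


def value(x):
    return sum(v for v, xi in zip(VALUES, x) if xi)


def feasible(x):
    return sum(w for w, xi in zip(WEIGHTS, x) if xi) <= CAPACITY


class Incumbent:
    """Best feasible solution found so far, shared by all workers."""

    def __init__(self):
        self._lock = threading.Lock()
        self._x = [0] * len(VALUES)  # the empty knapsack is always feasible
        self._obj = 0

    def offer(self, x):
        """Publish a feasible solution; keep it only if it improves the incumbent."""
        obj = value(x)
        with self._lock:
            if obj > self._obj:
                self._x, self._obj = list(x), obj

    def snapshot(self):
        with self._lock:
            return list(self._x), self._obj


def worker(seed, shared, iterations=2000, sync_every=100):
    rng = random.Random(seed)
    n = len(VALUES)
    # Random feasible start: add items in random order while they still fit.
    x = [0] * n
    for i in rng.sample(range(n), n):
        x[i] = 1
        if not feasible(x):
            x[i] = 0
    shared.offer(x)
    for step in range(1, iterations + 1):
        # Local move: flip two random items (the same index twice is a no-op).
        y = list(x)
        y[rng.randrange(n)] ^= 1
        y[rng.randrange(n)] ^= 1
        if feasible(y) and value(y) > value(x):
            x = y
            shared.offer(x)
        # No barrier: each worker checks the shared incumbent at its own pace.
        if step % sync_every == 0:
            best_x, best_obj = shared.snapshot()
            if best_obj > value(x):
                x = best_x  # adopt a better solution found by another worker


def solve(num_workers=4):
    shared = Incumbent()
    threads = [threading.Thread(target=worker, args=(seed, shared))
               for seed in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared.snapshot()


def brute_force_optimum():
    """Reference optimum by enumeration; only viable for tiny instances."""
    n = len(VALUES)
    return max(value(x) for x in product((0, 1), repeat=n) if feasible(x))


if __name__ == "__main__":
    best_x, best_obj = solve()
    print(best_x, best_obj, brute_force_optimum())
```

Because workers exchange information at moments that depend on thread scheduling, the returned solution may differ between runs. It is always feasible, but this simple heuristic does not guarantee optimality.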


Cold Storage System Design and Reliability/Availability Modelling [Supported by Quantum Corporation, PI: Suayb S. Arslan]

Power efficiency and renewable energy are at the forefront of any technology, including but not limited to long-term digital data storage. As data has grown at an unprecedented rate, the storage costs of giant data centers have multiplied. Hence, cold storage has emerged as a new paradigm for storing digital data with little energy expenditure. Just as each new technology requires novel techniques to format and protect data and make it available to users, cold data storage (particularly for archival use cases) has its own peculiar requirements. Looking forward, every country must think about and strategize its data future, and storage lies at the heart of this agenda. In this project, we lay the foundations for each and every aspect of digital cold data storage based on tape and DNA systems. We propose coding strategies, algorithms, detectors, reliability and availability models, accessibility strategies, etc. in order to embrace green data storage opportunities for a power-hungry future. This project also considers advanced discrete-time simulations for numerically modelling tape/optical disk/DNA libraries, as well as theoretical Markov modelling to estimate availability in scale-out settings. This project was supported by Quantum Corporation, Irvine, CA between 2016 and 2022.
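
To give a flavour of Markov availability modelling, the following minimal sketch computes the steady-state availability of a generic k-out-of-n storage group with a birth-death Markov chain. It is a textbook-style model, not the project's actual model: it assumes n identical units that fail independently at rate lam, a limited number of repair resources that each restore one unit at rate mu, and data that remains available as long as at least k units are up. All function and parameter names are hypothetical.

```python
def steady_state(n, lam, mu, repairers):
    """Steady-state distribution over the number of failed units (0..n).

    Birth-death chain: from j failed units, a failure occurs at rate
    (n - j) * lam and a repair completes at rate min(j, repairers) * mu.
    Requires repairers >= 1 and mu > 0.
    """
    weights = [1.0]
    for j in range(n):
        failure_rate = (n - j) * lam              # j -> j + 1 failed units
        repair_rate = min(j + 1, repairers) * mu  # j + 1 -> j failed units
        weights.append(weights[-1] * failure_rate / repair_rate)
    total = sum(weights)
    return [w / total for w in weights]


def availability(n, k, lam, mu, repairers):
    """Probability that at least k of the n units are up in steady state."""
    pi = steady_state(n, lam, mu, repairers)
    return sum(pi[: n - k + 1])  # states with at most n - k failed units


if __name__ == "__main__":
    # Example: 3 units, any 2 suffice, one failure per 1000 h, repairs take 10 h.
    print(availability(3, 2, lam=1 / 1000, mu=1 / 10, repairers=1))
```

With a single unit, this reduces to the familiar mu / (lam + mu). Limiting the number of repairers lowers availability, which is the kind of effect such a model is meant to quantify.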