I hope it’s OK to ask for some feedback here. If not, please let me know. The rules didn’t seem to prohibit it.

I’m very new to Rust. While my general coding background is OK, Rust still feels alien to me. I think I’m still at the “how to think in Rust” part of the curve.

So I would like to ask for opinions on the following bit of code. I know that those unwraps are too optimistic for production, and I could figure out how to pass io::Error from the function all the way up to the shell. But what other choices am I not seeing?
What would you write differently? What looks like Pythonisms/C++isms? Or does it miss the mark completely?

use std::{fs, io};  
use std::path::PathBuf;  
use std::convert::TryFrom;  

use clap::Parser;  
use parquet::file::reader::SerializedFileReader;  
use parquet::record;  
use csv::WriterBuilder;  

#[derive(Debug, Parser)]  
#[command(version, about, long_about = None)]  
struct Args {  
    dir: String,  
    #[arg(default_value = "0")]  
    count: usize  
}  

fn get_files_in_dir(dir: &str) -> Option<Vec<PathBuf>>  
{  
    let dir = fs::read_dir(dir);  
    if dir.is_err() {  
        return None  
    };  
    let files = dir.unwrap()  
        .map(|res| res.map(|e| e.path()))  
        .collect::<Result<Vec<_>, _>>();  
    if files.is_err() {  
        return None  
    }  
    files.ok()  
}  

fn read_parquet_dir(entries: &Vec<String>) ->  impl Iterator<Item = record::Row> {  
    entries.iter()  
        .map(|p| SerializedFileReader::try_from(p.clone()).unwrap())  
        .flat_map(|r| r.into_iter())  
        .map(|r| r.unwrap())  
}  
                            
fn main() -> Result<(), io::Error> {  
    let args = Args::parse();  
    let entries = match get_files_in_dir(&args.dir)  
    {  
        Some(entries) => entries,  
        None => return Ok(())  
    };  


    let mut wtr = WriterBuilder::new().from_writer(io::stdout());  
    for (idx, row) in read_parquet_dir(&entries.iter().map(|p| p.display().to_string()).collect()).enumerate() {  
        let values: Vec<String> = row.get_column_iter().map(|(_column, value)| value.to_string()).collect();  
        if idx == 0 {  
            wtr.serialize(row.get_column_iter().map(|(column, _value)| column.to_string()).collect::<Vec<String>>())?;  
        }  
        wtr.serialize(values)?;  
        if args.count>0 && idx+1 == args.count {  
            break;  
        }  
    }  
    
    Ok(())  
}  
  • shape_warrior_t
    5 days ago

    Not very familiar with the libraries, and BB_C has already given a lot of more detailed feedback, but here are some small things:

    • Even without rewriting get_files_in_dir as a single chain of method calls, you can still replace
      if dir.is_err() {  
          return None  
      };
      // use dir.unwrap()
      
      with
      let Ok(dir) = dir else {
          return None;
      };
      // use dir
      
      In general, if you can use if let or let else to pattern match, prefer that over unwrapping – Clippy has a lint relating to this, actually.
    • Although the formatting doesn’t seem horrible, it doesn’t seem like Rustfmt was used on it. You might want to make use of it for slightly more consistent formatting. I mainly noticed the lack of trailing commas, and the lack of semicolons after return None in get_files_in_dir.
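
To make the first point concrete, here is a minimal sketch of `get_files_in_dir` written two ways, using only the standard library. The first keeps the original `Option` return type and uses `let ... else`. The second propagates the `io::Error` with `?` instead of discarding it. The name `try_get_files_in_dir` and the `&Path` parameter are my own choices for illustration, not something from the original post.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Same signature as the original: any I/O failure becomes `None`.
fn get_files_in_dir(dir: &str) -> Option<Vec<PathBuf>> {
    // `let ... else` binds the success value or returns early,
    // so no later `unwrap()` is needed.
    let Ok(entries) = fs::read_dir(dir) else {
        return None;
    };
    entries
        .map(|res| res.map(|e| e.path()))
        .collect::<Result<Vec<_>, _>>()
        .ok() // a failed entry turns the whole result into `None`
}

/// Hypothetical variant: keep the error so the caller can report it.
fn try_get_files_in_dir(dir: &Path) -> io::Result<Vec<PathBuf>> {
    fs::read_dir(dir)? // `?` returns the `io::Error` to the caller
        .map(|res| res.map(|e| e.path()))
        .collect() // collects into `io::Result<Vec<PathBuf>>`
}
```

Since `main` in the post already returns `Result<(), io::Error>`, the second variant could be called there as `try_get_files_in_dir(Path::new(&args.dir))?`, which would surface a missing directory as an error instead of silently printing nothing.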