Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

34 add propper logging #36

Merged
merged 7 commits into from
Dec 19, 2024
Merged

34 add propper logging #36

merged 7 commits into from
Dec 19, 2024

Conversation

cwmeijer
Copy link
Contributor

@cwmeijer cwmeijer commented Dec 18, 2024

(fixes #34)
see this pr's edits to the readme.md for an overview of the added functionality

@cwmeijer cwmeijer marked this pull request as ready for review December 18, 2024 15:09
@cwmeijer cwmeijer requested a review from rogerkuou December 18, 2024 15:27
Copy link

@rogerkuou rogerkuou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @cwmeijer, thanks for the implementation! Most of thing look awesome.
I have one major concern about the performance of log_array. The others comments are all minor.

import pandas as pd


def setup_logger(name: str = None, level=logging.INFO):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the default level here is is only used for stdout_sh. When looking at the API, my intuitive feeling would be this indicates the default level of the output logger root_logger. I suggest either rename this to stdout_level, or pass this variable to the default level of root_logger. (I prefer the second)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My idea was to have logging to files for debug and info always on, and only have the level of stdout configurable. So in that sense, the second options isn't possible. I think the first option is an excellent suggestion.


# Configure logger and add handlers
root_logger = logging.getLogger()
root_logger.setLevel(logging.DEBUG)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following the suggestion on level, proposing this.

Suggested change
root_logger.setLevel(logging.DEBUG)
root_logger.setLevel(level)

Comment on lines +17 to +18
path = Path('log')
path.mkdir(exist_ok=True, parents=True)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider expose the path of log file as an option?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about that, but I think that adds unnecessary complexity for now. If you think we need it (now or soon), especially from the plugin side, I'll go for it of course.

logger.info(f"{task_name} finished in {duration:.4f} seconds")


def log_array(data: np.ndarray, logger, array_name:str="array") -> None:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this function, despite the level of logger, all calculations be called. In case a large data, this will significantly slow down you workflow.

Maybe we should consider check the level of logger?

IMO maybe it's better to move the calculation part out of this function and pass them as a dictionary in here. I think when calling a logging function, the computation effort is usally neglected. The user should be aware what calculation is actually done.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! I was focusing on gathering all data that we'd need to debug a user's problem, I didn't consider performance enough.

@@ -19,7 +20,7 @@ def image_path_to_mask_path(image_path: Path) -> Path:
self.masks = [str(image_path_to_mask_path(Path(p))) for p in self.images][:self.limit]
non_existing_masks = [p for p in self.masks if Path(p).exists() == False]
if non_existing_masks:
print(f"{len(non_existing_masks)} of a total of {len(self.masks)} masks not found.")
logging.getLogger().warning(f"{len(non_existing_masks)} of a total of {len(self.masks)} masks not found.")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

depends on how we would like this function envolve, maybe consider initiate a logger to collect all the logs? Like you did in other modules.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch! thanks!

@cwmeijer cwmeijer merged commit 8689614 into main Dec 19, 2024
11 checks passed
@cwmeijer cwmeijer deleted the 34-add-propper-logging branch December 19, 2024 15:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

add logging
2 participants